ALPHA-1,4-GLUCAN LYASE FROM A FUNGUS INFECTED ALGAE, ITS PURIFICATION, GENE
CLONING AND EXPRESSION IN MICROORGANISMS

The present invention relates to an enzyme, in particular α-1,4-glucan lyase ("GL").
The present invention also relates to a method of extracting the same. The present
invention also relates to nucleotide sequence(s) encoding the same.

FR-A-2617502 and Baute et al in Phytochemistry [1988] vol. 27 No. 11 pp 3401-3403
report on the production of 1,5-D-anhydrofructose ("AF") in Morchella vulgaris by
an apparent enzymatic reaction. The yield of production of AF is quite low. Despite
a reference to a possible enzymatic reaction, neither of these two documents presents
any amino acid sequence data for any enzyme, let alone any nucleotide sequence
information. These documents say that AF can be a precursor for the preparation of
the antibiotic pyrone microthecin.

Yu et al in Biochimica et Biophysica Acta [1993] vol 1156 pp 313-320 report on the
preparation of GL from red seaweed and its use to degrade α-1,4-glucan to produce
AF. The yield of production of AF is quite low. Despite a reference to the enzyme
GL this document does not present any amino acid sequence data for that enzyme let
alone any nucleotide sequence information coding for the same. This document also
suggests that the source of GL is just algal.

According to the present invention there is provided a method of preparing the
enzyme α-1,4-glucan lyase comprising isolating the enzyme from a fungally infected
algae.

Preferably the enzyme is isolated and/or further purified using a gel that is not 
degraded by the enzyme. 

Preferably the gel is based on dextrin, preferably beta-cyclodextrin, or derivatives
thereof, preferably a cyclodextrin, more preferably beta-cyclodextrin.

According to the present invention there is also provided a GL enzyme prepared by
the method of the present invention.

Preferably the enzyme comprises the amino acid sequence SEQ. ID. No. 1. or SEQ. 
ID. No. 2, or any variant thereof. 

The term "any variant thereof" means any substitution of, variation of, modification
of, replacement of, deletion of or addition of at least one amino acid from or to the
sequence providing the resultant enzyme has lyase activity.

According to the present invention there is also provided a nucleotide sequence coding
for the enzyme α-1,4-glucan lyase, preferably wherein the sequence is not in its
natural environment (i.e. does not form part of the natural genome of a cellular
organism expressing the enzyme).

Preferably the nucleotide sequence is a DNA sequence. 

Preferably the DNA sequence comprises a sequence that is the same as, or is 
complementary to, or has substantial homology with, or contains any suitable codon 
substitution(s) for any of those of, SEQ. ID. No. 3 or SEQ. ID. No. 4. 

The expression "substantial homology" covers homology with respect to structure 
and/or nucleotide components and/or biological activity. 

The expression "contains any suitable codon substitutions" covers any codon 
replacement or substitution with another codon coding for the same amino acid or any 
addition or removal thereof providing the resultant enzyme has lyase activity. 

In other words, the present invention also covers a modified DNA sequence in which
at least one nucleotide has been deleted, substituted or modified or in which at least
one additional nucleotide has been inserted so as to encode a polypeptide having the
activity of a glucan lyase, preferably an enzyme having an increased lyase activity.

According to the present invention there is also provided a method of preparing the
enzyme α-1,4-glucan lyase comprising expressing the nucleotide sequence of the
present invention.

According to the present invention there is also provided the use of beta-cyclodextrin 
to purify an enzyme, preferably GL. 

According to the present invention there is also provided a nucleotide sequence 
wherein the DNA sequence comprises a sequence that is the same as, or is 
complementary to, or has substantial homology with, or contains any suitable codon 
substitutions for any of those of, SEQ. ID. No. 3 or SEQ. ID. No. 4, preferably 
wherein the sequence is in isolated form. 

A key aspect of the present invention is the recognition that GL is derived from a 
fungally infected algae. This is the first time that the amino acid sequence of GL has 
been determined in addition to the determination of the nucleic acid sequences that 
code for GL. A key advantage of the present invention is therefore that GL can now 
be made in large quantities by for example recombinant DNA techniques and thus 
enable compounds such as the antibiotic microthecin to be made easily and in larger 
amounts. 

The enzyme should preferably be secreted to ease its purification. To do so the DNA 
encoding the mature enzyme is fused to a signal sequence, a promoter and a 
terminator from the chosen host. 

For expression in Aspergillus niger the gpdA (from the glyceraldehyde-3-phosphate
dehydrogenase gene of Aspergillus nidulans) promoter and signal sequence are fused
to the 5' end of the DNA encoding the mature lyase - such as SEQ. I.D. No. 3 or
SEQ. I.D. No. 4. The terminator sequence from the A. niger trpC gene is placed 3'
to the gene (Punt, P.J. et al (1991): J. Biotech. 17, 19-34). This construction is
inserted into a vector containing a replication origin and selection origin for E. coli
and a selection marker for A. niger. Examples of selection markers for A. niger are
the amdS gene, the argB gene, the pyrG gene, the hygB gene and the BmlR gene, all
of which have been used for selection of transformants. This plasmid can be transformed
into A. niger and the mature lyase can be recovered from the culture medium of the
transformants.

The construction can be transformed into a protease deficient strain to reduce the
proteolytic degradation of the lyase in the culture medium (Archer D.B. et al (1992):
Biotechnol. Lett. 14, 357-362).

Other advantages will become apparent in the light of the following description.

The present invention therefore relates to the isolation of the enzyme α-1,4-glucan
lyase from a fungus infected algae - preferably a fungus infected red algae such as the
type that can be collected in China - such as Gracilariopsis lemaneiformis. An
example of a fungally infected algae has been deposited in accordance with the
Budapest Treaty (see below).

By using in situ hybridisation technique it was established that the enzyme GL was
detected in the fungally infected red algae Gracilariopsis lemaneiformis. Further
evidence that supports this observation was provided by the results of Southern
hybridisation experiments. Thus GL enzyme activity can be obtained from fungally
infected algae, rather than just from the algae as was originally thought.

Of particular interest is the finding that there are two natural DNA sequences, each
of which codes for an enzyme having GL characteristics. These DNA
sequences have been sequenced and they are presented as SEQ. I.D. No. 3 and SEQ.
I.D. No. 4 (which are discussed and presented later).

An initial enzyme purification can be performed by the method as described by Yu
et al (ibid). However, it is preferred that the initial enzyme purification includes the
use of a solid support that does not decompose under the purification step. This gel
support has the advantage that it is compatible with standard laboratory protein
purification equipment. The details of this preferred purification process are given
later on. The purification is terminated by known standard techniques for protein
purification. The purity of the enzyme was established using complementary
electrophoretic techniques.

The purified lyase was characterized according to pI, temperature- and pH-optima.
In this regard, it was found that the enzyme has the following characteristics: an
optimum substrate specificity and a pH optimum at 3.5-7.5 when amylopectin is
used; a temperature optimum at 50°C; and a pI of 3.9.

As mentioned above, the enzymes according to the present invention have been
determined (partially by amino-acid sequencing techniques) and their amino acid
sequences are provided later. Likewise the nucleotide sequences coding for the
enzymes according to the present invention (i.e. GL) have been sequenced and the
DNA sequences are provided later.

The following samples were deposited in accordance with the Budapest Treaty at the
recognised depositary The National Collections of Industrial and Marine Bacteria
Limited (NCIMB) at 23 St. Machar Drive, Aberdeen, Scotland, United Kingdom,
AB2 1RY on 20 June 1994:

E. coli containing plasmid pGL1 (NCIMB 40652) - [ref. DH5alpha-pGL1]; and

E. coli containing plasmid pGL2 (NCIMB 40653) - [ref. DH5alpha-pGL2].

The following sample was accepted as a deposit in accordance with the Budapest
Treaty at the recognised depositary The Culture Collection of Algae and Protozoa
(CCAP) at Dunstaffnage Marine Laboratory PO Box 3, Oban, Argyll, Scotland,
United Kingdom, PA34 4AD on 11 October 1994:

Fungally infected Gracilariopsis lemaneiformis (CCAP 1373/1) - [ref. GLQ-1
(Qingdao)].

Thus highly preferred embodiments of the present invention include a GL enzyme
obtainable from the expression of the GL coding sequences present in plasmids that
are the subject of either deposit NCIMB 40652 or deposit NCIMB 40653; and a GL
enzyme obtainable from the fungally infected algae that is the subject of deposit
CCAP 1373/1.

The present invention will now be described only by way of example. 

In the following Examples reference is made to the accompanying figures in which: 

Figure 1 shows stained fungally infected algae;

Figure 2 shows stained fungally infected algae;

Figure 3 shows sections of fungal hypha;

Figure 4 shows sections of fungally infected algae;

Figure 5 shows a section of fungally infected algae;

Figure 6 shows a plasmid map of pGL1;

Figure 7 shows a plasmid map of pGL2;

Figure 8 shows the amino acid sequence represented as SEQ. I.D. No.3 showing
positions of the peptide fragments that were sequenced;

Figure 9 shows the alignment of SEQ. I.D. No. 1 with SEQ. I.D. No.2;

Figure 10 is a microphotograph.

In more detail, Figure 1 shows Calcofluor White stainings revealing fungi in upper
part and lower part of Gracilariopsis lemaneiformis (108x and 294x).

Figure 2 shows PAS/Aniline blue black staining of Gracilariopsis lemaneiformis with
fungi. The fungi have a significantly higher content of carbohydrates.

Figure 3 shows a micrograph showing longitudinal and grazing sections of two
thin-walled fungal hyphae (f) growing between thick walls (w) of algal cells. Note
thylakoid membranes in the algal chloroplast (arrows).

Figure 4 shows that the antisense detections with clone 2 probe (upper row) appear to be
restricted to the fungi illustrated by Calcofluor White staining of the succeeding
section (lower row) (46x and 108x).

Figure 5 shows that intense antisense detections with clone 2 probe are found over the
fungi in Gracilariopsis lemaneiformis (294x).

Figure 6 shows a map of plasmid pGL1 - which is a pBluescript II KS containing a
3.8 kb fragment isolated from a genomic library constructed from fungal infected
Gracilariopsis lemaneiformis. The fragment contains a gene coding for
alpha-1,4-glucan lyase.

Figure 7 shows a map of plasmid pGL2 - which is a pBluescript II SK containing a
3.6 kb fragment isolated from a genomic library constructed from fungal infected
Gracilariopsis lemaneiformis. The fragment contains a gene coding for
alpha-1,4-glucan lyase.

Figure 9 shows the alignment of SEQ. I.D. No. 1 (GL1) with SEQ. I.D. No.2
(GL2). The total number of residues for GL1 is 1088; and the total number of
residues for GL2 is 1091. In making the comparison, a structure-genetic matrix was
used (Open gap cost: 10; Unit gap cost: 2). In Figure 9 the character to show that
two aligned residues are identical is V; and the character to show that two aligned
residues are similar is \\. Amino acids said to be 'similar' are: A,S,T; D,E; N,Q;
R,K; I,L,M,V; F,Y,W. Overall there is an identity of 845 amino acids (i.e.
77.67%); a similarity of 60 amino acids (5.51%). The number of gaps inserted in
GL1 is 3 and the number of gaps inserted in GL2 is 2.
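
The alignment figures quoted for Figure 9 can be reproduced with a short calculation. The following Python sketch is illustrative only and is not part of the original disclosure: the function names `classify` and `percent` are hypothetical, and it assumes that the percentages are expressed relative to the 1088 residues of GL1, which is the denominator that reproduces the stated 77.67% and 5.51%.

```python
# Similarity groups exactly as listed in the description of Figure 9.
SIMILAR_GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]


def classify(a: str, b: str) -> str:
    """Classify one aligned residue pair (one-letter codes)."""
    if a == b:
        return "identical"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def percent(count: int, length: int) -> float:
    """Percentage of `count` positions relative to a sequence length."""
    return round(100.0 * count / length, 2)


GL1_LENGTH = 1088  # residues in SEQ. I.D. No. 1 (assumed denominator)
IDENTICAL = 845    # identical positions stated in the text
SIMILAR = 60       # similar positions stated in the text

identity_pct = percent(IDENTICAL, GL1_LENGTH)
similarity_pct = percent(SIMILAR, GL1_LENGTH)
print(f"identity {identity_pct}%, similarity {similarity_pct}%")
```

Using the GL2 length of 1091 as the denominator would give slightly lower values, which is why GL1 is assumed here.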

Figure 10 is a microphotograph of a fungal hypha (f) growing between the algal walls
(w). Note grains of floridean starch (s) and thylakoids (arrows) in the algal cell.

The following sequence information was used to generate primers for the PCR
reactions mentioned below and to check the amino acid sequence generated by the
respective nucleotide sequences.

Amino acid sequence assembled from peptides from fungus infected Gracilariopsis 
lemaneiformis 

Tyr Arg Trp Gln Glu Val Leu Tyr Thr Ala Met Tyr Gln Asn Ala
Ala Phe Gly Lys Pro Ile Ile Lys Ala Ala Ser Met Tyr Asn Asn
Asp Ser Asn Val Arg Arg Ala Gln Asn Asp His Phe Leu Leu Gly
Gly His Asp Gly Tyr Arg Ile Leu Cys Ala Pro Val Val Trp Glu
Asn Ser Thr Glu Arg Glu Leu Tyr Leu Pro Val Leu Thr Gln Trp
Tyr Lys Phe Gly Pro Asp Phe Asp Thr Lys Pro Leu Glu Gly Ala

The amino acid sequence (27-34) used to generate primers A and B (Met Tyr Asn
Asn Asp Ser Asn Val)

Primer A

ATG TA(TC) AA(CT) AA(CT) GA(CT) TC(GATC) AA(CT) GT 128 mix

Primer B

ATG TA(TC) AA(CT) AA(CT) GA(CT) AG(CT) AA(CT) GT 64 mix

The amino acid sequence (45-50) used to generate primer C (Gly Gly His Asp Gly
Tyr)

Primer C

TA (GATC)CC (GA)TC (GA)TG (GATC)CC (GATC)CC 256 mix
[The sequence corresponds to the complementary strand.]

The amino acid sequence (74-79) used to generate primer E (Gln Trp Tyr Lys Phe
Gly)

Primer E

GG(GATC) CC(GA) AA(CT) TT(GA) TAC CA(CT) TG 64 mix
[The sequence corresponds to the complementary strand.]

The amino acid sequence (1-6) used to generate primers F1 and F2 (Tyr Arg Trp Gln
Glu Val)

Primer F1

TA(TC) CG(GATC) TGG CA(GA) GA(GA) GT 32 mix

Primer F2

TA(TC) AG(GA) TGG CA(GA) GA(GA) GT 16 mix
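
The "mix" number given with each primer is the number of distinct oligonucleotides in the degenerate pool, i.e. the product of the number of alternatives at each bracketed position. The Python sketch below is an illustration only (the helper names `degeneracy` and `expand` are hypothetical and do not come from the specification); it recomputes each value from the primer notation used above.

```python
import itertools
import re

# Primer notation and stated pool size, copied from the text above.
PRIMERS = {
    "A": ("ATG TA(TC) AA(CT) AA(CT) GA(CT) TC(GATC) AA(CT) GT", 128),
    "B": ("ATG TA(TC) AA(CT) AA(CT) GA(CT) AG(CT) AA(CT) GT", 64),
    "C": ("TA (GATC)CC (GA)TC (GA)TG (GATC)CC (GATC)CC", 256),
    "E": ("GG(GATC) CC(GA) AA(CT) TT(GA) TAC CA(CT) TG", 64),
    "F1": ("TA(TC) CG(GATC) TGG CA(GA) GA(GA) GT", 32),
    "F2": ("TA(TC) AG(GA) TGG CA(GA) GA(GA) GT", 16),
}


def degeneracy(primer: str) -> int:
    """Multiply the number of alternative bases at each bracketed position."""
    count = 1
    for group in re.findall(r"\(([ACGT]+)\)", primer):
        count *= len(set(group))
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every oligonucleotide contained in the degenerate mix."""
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT])", primer.replace(" ", ""))
    choices = [group if group else base for group, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


for name, (sequence, stated) in PRIMERS.items():
    print(f"Primer {name}: computed {degeneracy(sequence)}, stated {stated}")
```

For a small pool such as primer F2, `expand` can be used to list the individual 17-mers, for example when checking which member of the mix matches a cloned sequence.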

The sequence obtained from the first PCR amplification (clone 1)

ATGTACAACA ACGACTCGAA CGTTCGCAGG GCGCAGAACG ATCATTTCCT
TCTTGGCGGC CACGACGGTT A

Met Tyr Asn Asn Asp Ser Asn Val Arg Arg Ala Gln Asn Asp His Phe Leu Leu Gly
Gly His Asp Gly

The sequence obtained from the second PCR amplification (clone 1)

ATGTACAACA ACGACTCGAA CGTTCGCAGG GCGCAGAACG ATCATTTCCT
TCTTGGTGGA CATGATGGAT ATCGCATTCT GTGCGCGCCT GTTGTGTGGG
AGAATTCGAC CGAACGNGAA TTGTACTTGC CCGTGCTGAC CCAATGGTAC
AAATTCGGCC C

Met Tyr Asn Asn Asp Ser Asn Val Arg Arg Ala Gln Asn Asp His Phe Leu Leu Gly
Gly His Asp Gly Tyr Arg Ile Leu Cys Ala Pro Val Val Trp Glu Asn Ser Thr Glu
Arg Glu Leu Tyr Leu Pro Val Leu Thr Gln Trp Tyr Lys Phe Gly Pro

The sequence obtained from the third PCR amplification (clone 2)

TACAGGTGGC AGGAGGTGTT GTACACTGCT ATGTACCAGA
ATGCGGCTTT CGGGAAACCG ATTATCAAGG CAGCTTCCAT
GTACGACAAC GACAGAAACG TTCGCGGCGC ACAGGATGAC
CACTTCCTTC TCGGCGGACA CGATGGATAT CGTATTTTGT
GTGCACCTGT TGTGTGGGAG AATACAACCA GTCGCGATCT
GTACTTGCCT GTGCTGACCA GTGGTACAAA TTCGGCCC

Tyr Arg Trp Gln Glu Val Leu Tyr Thr Ala Met Tyr Gln Asn Ala Ala Phe Gly Lys
Pro Ile Ile Lys Ala Ala Ser Met Tyr Asp Asn Asp Arg Asn Val Arg Gly Ala Gln Asp
Asp His Phe Leu Leu Gly Gly His Asp Gly Tyr Arg Ile Leu Cys Ala Pro Val Val
Trp Glu Asn Thr Thr Ser Arg Asp Leu Tyr Leu Pro Val Leu Thr Lys Trp Tyr Lys
Phe Gly
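
The correspondence between a PCR fragment and its stated translation can be checked by translating the nucleotide sequence with the standard genetic code. The Python sketch below is illustrative only: it uses the 71-base fragment from the first PCR amplification listed above, assumes that the reading frame starts at the first base, and ignores the incomplete final codon. The names `translate` and `to_three_letter` are hypothetical helpers.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string; a trailing partial codon is dropped."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_three_letter(protein: str) -> str:
    """Convert one-letter residues to the three-letter style used in the text."""
    return " ".join(THREE_LETTER[residue] for residue in protein)


# Fragment from the first PCR amplification, as listed in the text.
FIRST_FRAGMENT = (
    "ATGTACAACA ACGACTCGAA CGTTCGCAGG GCGCAGAACG ATCATTTCCT "
    "TCTTGGCGGC CACGACGGTT A"
)
# Translation stated in the text for that fragment.
STATED = (
    "Met Tyr Asn Asn Asp Ser Asn Val Arg Arg Ala Gln Asn Asp His Phe "
    "Leu Leu Gly Gly His Asp Gly"
)

protein = translate(FIRST_FRAGMENT)
print(to_three_letter(protein))
```

The longer fragments are not used in this sketch because the second one contains an undetermined base (N), which the simple codon lookup shown here does not handle.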



1. CYTOLOGICAL INVESTIGATIONS OF GRACILARIOPSIS LEMANEIFORMIS

1.1.1 Detection of fungal infection in Gracilariopsis lemaneiformis 

Sections of Gracilariopsis lemaneiformis collected in China were either hand cut or 
cut from paraffin embedded material. Sectioned material was carefully investigated 
by light microscopy. Fungal hyphae were clearly detected in Gracilariopsis 
lemaneiformis. 

The thalli of Gracilariopsis lemaneiformis are composed of cells appearing in a
highly ordered and almost symmetric manner. The tubular thallus of G.
lemaneiformis is composed of large, colourless central cells surrounded by elongated,
slender, elliptical cells and small, round, red pigmented peripheral cells. All algal
cell types are characterized by thick cell walls. Most of the fungal hyphae are found
at the interface between the central layer of large cells and the peripheral layer.
These cells can clearly be distinguished from the algal cells as they are long and
cylindrical. The growth of the hyphae is observed as irregularities between the highly
ordered algal cells. The most frequent orientation of the hyphae is along the main
axis of the algal thallus. Side branches toward the centre and periphery are detected
in some cases. The hyphae cannot be confused with the endo/epiphytic 2nd generation
of the algae.

Calcofluor White is known to stain chitin and cellulose containing tissue. The reaction
with chitin requires four covalently linked terminal N-acetylglucosamine residues.
It is generally accepted that cellulose is almost restricted to higher plants although it
might occur in trace amounts in some algae. It is further known that chitin is absent
in Gracilaria.

Calcofluor White was found to stain domains corresponding to fungal hyphal cell walls
in sectioned Gracilariopsis lemaneiformis material.

The hyphae appear clear white against a faint blue background of Gracilaria tissue
when observed under u.v. light - see Figure 1. Chitin is the major cell wall
component in most fungi but absent in Gracilaria. Based upon these observations we
conclude that the investigated algae is infected by a fungus. 40% of the lower parts
of the investigated Gracilariopsis lemaneiformis sections were found to be infected
with fungal hyphae. In the algae tips 25% of the investigated Gracilariopsis
lemaneiformis sections were found to be infected.

Staining of sectioned Gracilariopsis lemaneiformis with Periodic acid Schiff (PAS)
and Aniline blue black revealed a significantly higher content of carbohydrates within
the fungal cells as compared with the algal cells - see Figure 2. Safranin O and
Malachite Green showed the same colour reaction of fungal cells as found in higher
plants infected with fungi.

An Acridine Orange reaction with sectioned Gracilariopsis lemaneiformis showed
clearly the irregular growth of the fungus.

1.1.2 Electron Microscopy 

Slides with 15 µm thick sections, where the fungus was detected with Calcofluor
White, were fixed in 2% OsO4, washed in water and dehydrated in dimethoxypropane
and absolute alcohol. A drop of a 1:1 mixture of acetone and Spurr resin was placed
over each section on the glass slide, and after one hour replaced by a drop of pure
resin. A gelatin embedding capsule filled with resin was placed face down over the
section and left over night at 4°C. After the polymerization at 55°C for 8 hrs, the
thick sections adhering to the resin blocks could be separated from the slide by
immersion in liquid nitrogen.

Blocks were trimmed and 100 nm thick sections were cut using a diamond knife on
a microtome. The sections were stained in aqueous uranyl acetate and in lead citrate.
The sections were examined in an electron microscope at 80 kV.

The investigation confirmed the light microscopical observations and provided further
evidence that the lyase producing, Chinese strain of G. lemaneiformis is infected by a
fungal parasite or symbiont.

Fungal hyphae are built of tubular cells 50 to 100 µm long and only a few microns in
diameter. The cells are serially arranged with septate walls between the adjacent cells.
Occasional branches are also seen. The hyphae grow between the thick cell walls of
the algal thallus without penetrating the wall or damaging the cell. Such a symbiotic
association, called mycophycobiosis, is known to occur between some filamentous
marine fungi and large marine algae (Donk and Bruning, 1992 - Ecology of aquatic
fungi in and on algae. In Reisser, W. (ed.): Algae and Symbioses: Plants, Animals,
Fungi, Viruses, Interactions Explored. Biopress Ltd., Bristol).

Examining the microphotograph in Figure 10, several differences between algal and
fungal cells can be noticed. In contrast to the several µm thick walls of the alga, the
fungal walls are only 100-200 nm thick. Typical plant organelles such as chloroplasts
with thylakoid membranes as well as floridean starch grains can be seen in algal cells,
but not in the fungus.

Intercellular connections of red algae are characterized by specific structures termed
pit plugs, or pit connections. The structures are prominent, electron dense cores and
they are important features in algal taxonomy (Pueschel, C.M.: An expanded survey
of the ultrastructure of Red algal pit plugs. J. Phycol. 25, 625, (1989)). In our
material, such connections were frequently observed in the algal thallus, but never
between the cells of the fungus.

1.2 In situ Hybridization experiments 

In situ hybridization technique is based upon the principle of hybridization of an
antisense ribonucleotide sequence to the mRNA. The technique is used to visualize
areas in microscopic sections where said mRNA is present. In this particular case the
technique is used to localize the enzyme α-1,4-glucan lyase in sections of
Gracilariopsis lemaneiformis.

1.2.1 Preparation of 35S labelled probes for In situ hybridization

A 238 bp PCR fragment from a third PCR amplification - called clone 2 (see above) -
was cloned into the pGEM-3Zf(+) Vector (Promega). The transcription of the
antisense RNA was driven by the SP6 promoter, and the sense RNA by the T7
promoter. The Ribonuclease protection assay kit (Ambion) was used with the
following modifications. The transcripts were run on a 6% sequencing gel to remove
the unincorporated nucleotide and eluted with the elution buffer supplied with the
T7 RNA polymerase in vitro Transcription Kit (Ambion). The antisense transcript
contained 23 non-coding nucleotides while the sense contained 39. For hybridization
10⁷ cpm/ml of the 35S labelled probe was used.

In situ hybridisation was performed essentially as described by Langdale
et al. (1988). The hybridization temperature was found to be optimal at 45°C. After
washing at 45°C the sections were covered with Kodak K-5 photographic emulsion
and left for 3 days at 5°C in the dark (Ref: Langdale, J.A., Rothermel, B.A. and
Nelson, T. (1988). Genes and Development 2: 106-115. Cold Spring Harbour
Laboratory).

The in situ hybridization experiments with riboprobes against the mRNA of
α-1,4-glucan lyase show strong hybridizations over and around the hyphae of the fungus
detected in Gracilariopsis lemaneiformis - see Figures 4 and 5. This is considered
a strong indication that the α-1,4-glucan lyase is produced. Weak random
background reactions were detected in the algal tissue of Gracilariopsis
lemaneiformis. This reaction was observed both with the sense and the antisense
probes. Intense staining over the fungal hyphae was only obtained with antisense
probes.

These results were obtained with standard hybridisation conditions at 45°C in
hybridization and washing steps. At 50°C no staining over the fungi was observed,
whereas the background staining remained the same. Raising the temperature to 55°C
reduced the background staining with both sense and antisense probes significantly
and equally.

Based upon the cytological investigations using complementary staining procedures
it is concluded that Gracilariopsis lemaneiformis is fungus infected. The infections
are most pronounced in the lower parts of the algal tissue.

In sectioned Gracilariopsis lemaneiformis material in situ hybridization results clearly
indicate that hybridization is restricted to areas where fungal infections are found -
see Figure 4. The results indicate that α-1,4-glucan lyase mRNA appears to be
restricted to fungus infected areas in Gracilariopsis lemaneiformis.

Based upon these observations we conclude that α-1,4-glucan lyase activity is detected
in fungally infected Gracilariopsis lemaneiformis.

2. ENZYME PURIFICATION AND CHARACTERIZATION 



Purification of α-1,4-glucan lyase from fungal infected Gracilariopsis lemaneiformis
material was performed as follows.

2.1 Materials and Methods

The algae were harvested by filtration and washed with 0.9% NaCl. The cells were
broken by homogenization followed by sonication on ice for 6x3 min in 50 mM
citrate-NaOH pH 6.2 (Buffer A). Cell debris was removed by centrifugation at
25,000xg for 40 min. The supernatant obtained by this procedure was regarded as
cell-free extract and was used for activity staining and Western blotting after
separation on 8-25% gradient gels.



2.2 Separation by β-cyclodextrin Sepharose gel

The cell-free extract was applied directly to a β-cyclodextrin Sepharose gel 4B
column (2.6 x 18 cm) pre-equilibrated with Buffer A. The column was washed
with 3 volumes of Buffer A and 2 volumes of Buffer A containing 1 M NaCl.
α-1,4-Glucan lyase was eluted with 2% dextrins in Buffer A. Active fractions were
pooled and the buffer changed to 20 mM Bis-tris propane-HCl (pH 7.0, Buffer B).

Active fractions were applied onto a Mono Q HR 5/5 column pre-equilibrated with
Buffer B. The fungal lyase was eluted with Buffer B in a linear gradient of 0.3 M
NaCl.

The lyase preparation obtained after β-cyclodextrin Sepharose chromatography was
alternatively concentrated to 150 µl and applied on a Superose 12 column operated
under FPLC conditions.

2.3 Assay for α-1,4-glucan lyase activity and conditions for determination of
substrate specificity, pH and temperature optimum

The reaction mixture for the assay of the α-1,4-glucan lyase activity contained 10 mg
ml⁻¹ amylopectin and 25 mM Mes-NaOH (pH 6.0). The reaction was carried out at
30°C for 30 min and stopped by the addition of 3,5-dinitrosalicylic acid reagent.
Optical density at 550 nm was measured after standing at room temperature for 10
min.

3. AMINO ACID SEQUENCING OF THE α-1,4-GLUCAN LYASE FROM
FUNGUS INFECTED GRACILARIOPSIS LEMANEIFORMIS

3.1 Amino acid sequencing of the lyases

The lyases were digested with either endoproteinase Arg-C from Clostridium
histolyticum or endoproteinase Lys-C from Lysobacter enzymogenes, both sequencing
grade purchased from Boehringer Mannheim, Germany. For digestion with
endoproteinase Arg-C, freeze dried lyase (0.1 mg) was dissolved in 50 µl 10 M urea,
50 mM methylamine, 0.1 M Tris-HCl, pH 7.6. After overlay with N2 and addition
of 10 µl of 50 mM DTT and 5 mM EDTA the protein was denatured and reduced for
10 min at 50°C under N2. Subsequently, 1 µg of endoproteinase Arg-C in 10 µl of 50
mM Tris-HCl, pH 8.0 was added, N2 was overlaid and the digestion was carried out
for 6 h at 37°C. For subsequent cysteine derivatization, 12.5 µl 100 mM iodoacetamide
was added and the solution was incubated for 15 min at RT in the dark under
N2.

For digestion with endoproteinase Lys-C, freeze dried lyase (0.1 mg) was dissolved
in 50 µl of 8 M urea, 0.4 M NH4HCO3, pH 8.4. After overlay with N2 and addition
of 5 µl of 45 mM DTT, the protein was denatured and reduced for 15 min at 50°C
under N2. After cooling to RT, 5 µl of 100 mM iodoacetamide was added for the
cysteines to be derivatized for 15 min at RT in the dark under N2.

Subsequently, 90 µl of water and 5 µg of endoproteinase Lys-C in 50 µl of 50 mM
tricine and 10 mM EDTA, pH 8.0, was added and the digestion was carried out for
24 h at 37°C under N2.

The resulting peptides were separated by reversed phase HPLC on a VYDAC C18
column (0.46 x 15 cm; 10 µm; The Separations Group; California) using solvent A:
0.1% TFA in water and solvent B: 0.1% TFA in acetonitrile. Selected peptides were
rechromatographed on a Develosil C18 column (0.46 x 10 cm; 3 µm; Dr. Ole Schou,
Novo Nordisk, Denmark) using the same solvent system prior to sequencing on an
Applied Biosystems 476A sequencer using pulsed-liquid fast cycles.

WO 95/10618

PCT/EP94/03399

The amino acid sequence information from the enzyme derived from fungus infected 
Gracilariopsis lemaneiformis is shown below, in particular SEQ. ID. No. 1. and 
SEQ. ID. No. 2. 



SEQ. I.D. No. 1 has:
Number of residues: 1088.

Amino acid composition (including the signal sequence)

 61 Ala     15 Cys     19 His     34 Met     78 Thr
 51 Arg     42 Gln     43 Ile     53 Phe     24 Trp
 88 Asn     53 Glu     63 Leu     51 Pro     58 Tyr
 79 Asp    100 Gly     37 Lys     62 Ser     77 Val

SEQ. I.D. No. 2 has:
Number of residues: 1091.

Amino acid composition (including the signal sequence)

 58 Ala     16 Cys     14 His     34 Met     68 Thr
 57 Arg     40 Gln     44 Ile     56 Phe     23 Trp
 84 Asn     47 Glu     69 Leu     51 Pro     61 Tyr
 81 Asp    102 Gly     50 Lys     60 Ser     76 Val
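
As a consistency check on the two composition tables, the per-residue counts can be summed and compared with the stated number of residues. The Python sketch below is illustrative only; the dictionary names and the helper `total_residues` are hypothetical, and the counts are copied from the tables for SEQ. I.D. No. 1 and SEQ. I.D. No. 2.

```python
# Amino acid composition of SEQ. I.D. No. 1 (stated length 1088).
SEQ1_COMPOSITION = {
    "Ala": 61, "Arg": 51, "Asn": 88, "Asp": 79, "Cys": 15,
    "Gln": 42, "Glu": 53, "Gly": 100, "His": 19, "Ile": 43,
    "Leu": 63, "Lys": 37, "Met": 34, "Phe": 53, "Pro": 51,
    "Ser": 62, "Thr": 78, "Trp": 24, "Tyr": 58, "Val": 77,
}

# Amino acid composition of SEQ. I.D. No. 2 (stated length 1091).
SEQ2_COMPOSITION = {
    "Ala": 58, "Arg": 57, "Asn": 84, "Asp": 81, "Cys": 16,
    "Gln": 40, "Glu": 47, "Gly": 102, "His": 14, "Ile": 44,
    "Leu": 69, "Lys": 50, "Met": 34, "Phe": 56, "Pro": 51,
    "Ser": 60, "Thr": 68, "Trp": 23, "Tyr": 61, "Val": 76,
}


def total_residues(composition: dict[str, int]) -> int:
    """Sum the per-residue counts of a composition table."""
    return sum(composition.values())


seq1_total = total_residues(SEQ1_COMPOSITION)
seq2_total = total_residues(SEQ2_COMPOSITION)
print(seq1_total, seq2_total)
```

Both totals agree with the stated sequence lengths, which supports the assignment of the interleaved table entries to the two sequences.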



3.2 N-TERMINAL ANALYSIS

Studies showed that the N-terminal sequence of native glucan lyase 1 was blocked.
Deblocking was achieved by treating glucan lyase 1 blotted onto a PVDF membrane
with anhydrous TFA for 30 min at 40°C essentially as described by LeGendre et al.
(1993) [Purification of proteins and peptides by SDS-PAGE; In: Matsudaira, P. (ed.)
A practical guide to protein and peptide purification for microsequencing, 2nd edition;
Academic Press Inc., San Diego; pp. 74-101.]. The sequence obtained was
TALSDKQTA, which matches the sequence (sequence position from 51 to 59 of
SEQ. I.D. No. 1) derived from the clone for glucan lyase 1 and indicates
N-acetylthreonine as N-terminal residue of glucan lyase 1. Sequence position 1 to 50
of SEQ. I.D. No. 1 represents a signal sequence.

4. DNA SEQUENCING OF GENES CODING FOR THE α-1,4-GLUCAN
LYASE FROM FUNGUS INFECTED GRACILARIOPSIS LEMANEIFORMIS

4.1 METHODS FOR MOLECULAR BIOLOGY

DNA was isolated as described by Saunders (1993) with the following modification: 
The polysaccharides were removed from the DNA by ELUTIP-d (Schleicher & 
Schuell) purification instead of gel purification. (Ref: Saunders, G.W. (1993). Gel 
purification of red algal genomic DNA: An inexpensive and rapid method for the 
isolation of PCR-friendly DNA. Journal of phycology 29(2): 251-254 and Schleicher 
& Schuell: ELUTIP-d. Rapid Method for Purification and Concentration of DNA.) 

4.2 PCR 

The preparation of the relevant DNA molecule was done by use of the Gene Amp 
DNA Amplification Kit (Perkin Elmer Cetus, USA) and in accordance with the 
manufactures instructions except that the Taq polymerase was added later (see PCR 
cycles) and the temperature cycling was changed to the following: 
PCR cycles: 

no of cycles c time (min.) 

1 98 5 



60 

addition of Taq polymerase 
35 94 



and oil 



5 



1 



47 



2 



72 



3 



1 



72 



20 
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4.3 CLONING OF PCR FRAGMENTS 

PCR fragments were cloned into pT7Blue (from Novagen) following the instructions
of the supplier.


4.4 DNA SEQUENCING 

Double stranded DNA was sequenced essentially according to the dideoxy method of
Sanger et al. (1977) using the Auto Read Sequencing Kit (Pharmacia) and the
Pharmacia LKB A.L.F. DNA sequencer. (Ref: Sanger, F., Nicklen, S. and Coulson,
A.R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad.
Sci. USA 74: 5463-5467.).

The sequences are shown as SEQ. I.D. Nos. 3 and 4, wherein

SEQ. I.D. No. 3 has:

Total number of bases is: 3267.

DNA sequence composition: 850 A; 761 C; 871 G; 785 T

SEQ. I.D. No. 4 has:

Total number of bases is: 3276.

DNA sequence composition: 889 A; 702 C; 856 G; 829 T
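
As an illustrative consistency check, the stated base compositions can be summed and compared with the stated totals, and a G+C fraction derived. The dictionary layout and the names `COMPOSITION`, `STATED_TOTAL` and `gc_fraction` are hypothetical; only the counts are taken from the text.

```python
# Base counts and totals as stated for SEQ. I.D. Nos. 3 and 4.
COMPOSITION = {
    3: {"A": 850, "C": 761, "G": 871, "T": 785},
    4: {"A": 889, "C": 702, "G": 856, "T": 829},
}
STATED_TOTAL = {3: 3267, 4: 3276}


def gc_fraction(counts):
    """Fraction of G and C among all counted bases."""
    return (counts["G"] + counts["C"]) / sum(counts.values())


for seq_id, counts in COMPOSITION.items():
    total = sum(counts.values())
    consistent = total == STATED_TOTAL[seq_id]
    print(f"SEQ. I.D. No. {seq_id}: {total} bases, "
          f"matches stated total: {consistent}, GC = {gc_fraction(counts):.3f}")
```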

4.5 SCREENING OF THE LIBRARY

Screening of the Lambda Zap library obtained from Stratagene was performed in
accordance with the manufacturer's instructions except that the prehybridization and
hybridization were performed in 2xSSC, 0.1% SDS, 10xDenhardt's and 100 µg/ml
denatured salmon sperm DNA. To the hybridization solution a 32P-labeled denatured
probe was added. Hybridization was performed overnight at 55°C. The filters were
washed twice in 2xSSC, 0.1% SDS and twice in 1xSSC, 0.1% SDS.
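
Purely as a worked example, the hybridization solution can be made up from concentrated stocks by simple dilution (final concentration / stock concentration x final volume). The stock strengths used below (20xSSC, 10% SDS, 50xDenhardt's, 10 mg/ml salmon sperm DNA) are assumptions chosen for illustration and are not taken from the text; the names `STOCKS`, `stock_volume` and `recipe` are hypothetical.

```python
# (final concentration, assumed stock concentration), in matching units.
STOCKS = {
    "SSC": (2, 20),                 # x-fold
    "SDS": (0.1, 10),               # percent
    "Denhardt's": (10, 50),         # x-fold
    "salmon sperm DNA": (100, 10_000),  # ug/ml (stock assumed at 10 mg/ml)
}


def stock_volume(final_conc, stock_conc, final_volume_ml):
    """Volume of stock (ml) needed to reach final_conc in final_volume_ml."""
    return final_conc / stock_conc * final_volume_ml


def recipe(final_volume_ml):
    """Volumes (ml) of each stock plus water for the given final volume."""
    volumes = {
        name: stock_volume(final, stock, final_volume_ml)
        for name, (final, stock) in STOCKS.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


for component, ml in recipe(100.0).items():
    print(f"{component}: {ml:.1f} ml")
```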



WO 95/10618 



PCT/EP94/03399 




4.6 PROBE 

The cloned PCR fragments were isolated from the pT7Blue vector by digestion with
appropriate restriction enzymes. The fragments were separated from the vector by
agarose gel electrophoresis and the fragments were purified from the agarose by
Agarase (Boehringer Mannheim). As the fragments were only 90-240 bp long the
isolated fragments were exposed to a ligation reaction before labelling with 32P-dCTP
using either Prime-It random primer kit (Stratagene) or Ready to Go DNA labelling
kit (Pharmacia).

4.7 RESULTS 

4.7.1 Generation of PCR DNA fragments coding for α-1,4-glucan lyase.

The amino acid sequences of three overlapping tryptic peptides from α-1,4-glucan
lyase were used to generate mixed oligonucleotides, which could be used as PCR
primers (see the sequences given above).

In the first PCR amplification primers A/B (see above) were used as upstream
primers and primer C (see above) was used as downstream primer. The size of the
expected PCR product was 71 base pairs.

In the second PCR amplification primers A/B were used as upstream primers and E
was used as downstream primer. The size of the expected PCR product was 161 base
pairs.

In the third PCR amplification primers F1 (see above) and F2 (see above) were used
as upstream primers and E was used as downstream primer. The size of the expected
PCR product was 238 base pairs. The PCR products were analysed on a 2% LMT
agarose gel and fragments of the expected sizes were cut out from the gel and treated
with Agarase (Boehringer Mannheim) and cloned into the pT7Blue vector (Novagen)
and sequenced.
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The cloned fragments from the first and second PCR amplification coded for amino 
acids corresponding to the sequenced peptides (see above). The clone from the third 
amplification (see above) was only about 87% homologous to the sequenced peptides. 

4.7.2 Screening of the genomic library with the cloned PCR fragments.

Screening of the library with the above-mentioned clones gave two clones. One clone
contained the nucleotide sequence of SEQ. I.D. No. 4 (gene 2). The other clone
contained some of the sequence of SEQ. I.D. No. 3 (from base pair 1065 downwards)
(gene 1).

The 5' end of SEQ. I.D. No. 3 (i.e. from base pair 1064 upwards) was obtained by
the RACE (rapid amplification of cDNA ends) procedure (Michael, A.F., Michael,
K.D. & Martin, G.R. (1988). Proc. Natl. Acad. Sci. USA 85: 8998-9002.) using the
5' RACE system from Gibco BRL. Total RNA was isolated according to Collinge et
al. (Collinge, D.B., Milligan, D.E., Dow, J.M., Scofield, G. & Daniels, M.J. (1987).
Plant Mol Biol 8: 405-414). The 5' RACE was done according to the protocol of the
manufacturer, using 1 µg of total RNA. The PCR product from the second
amplification was cloned into pT7Blue vector from Novagen according to the
protocol of the manufacturer. Three independent PCR clones were sequenced to
compensate for PCR errors.

An additional PCR was performed to supplement the clone just described with XbaI
and NdeI restriction sites immediately in front of the ATG start codon using the
following oligonucleotide as an upstream primer:

GCTCTAGAGC ATG TTTTCAACCCTTGCG

and a primer containing the complement sequence of bp 1573-1593 in sequence GL1
(i.e. SEQ. I.D. No. 3) was used as a downstream primer.
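
For illustration, the gene-specific part of the upstream primer can be compared with the start of SEQ. I.D. No. 3, and the XbaI recognition site (TCTAGA) located in the 5' tail. The primer string assumes that the garbled characters in the printed oligonucleotide read TTTT, as in the first bases of SEQ. I.D. No. 3; the names `UPSTREAM_PRIMER`, `SEQ3_START` and `split_primer` are hypothetical.

```python
# Upstream primer as read from the text (TTTT after ATG is an assumption).
UPSTREAM_PRIMER = "GCTCTAGAGCATGTTTTCAACCCTTGCG"
# First 30 bases of SEQ. I.D. No. 3 from the sequence listing.
SEQ3_START = "ATGTTTTCAACCCTTGCGTTTGTCGCACCT"
XBAI_SITE = "TCTAGA"


def split_primer(primer):
    """Split a primer into its 5' tail and the part starting at the first ATG."""
    atg_index = primer.find("ATG")
    if atg_index < 0:
        raise ValueError("no ATG start codon in primer")
    return primer[:atg_index], primer[atg_index:]


tail, gene_part = split_primer(UPSTREAM_PRIMER)
print("5' tail:", tail)
print("gene-specific part:", gene_part)
print("anneals to start of SEQ. I.D. No. 3:", SEQ3_START.startswith(gene_part))
print("XbaI site in tail:", XBAI_SITE in tail)
```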

The complete sequence for gene 1 (i.e. SEQ. I.D. No. 3) was generated by cloning
the 3' end of the gene as a BamHI-HindIII fragment from the genomic clone into the
pBluescript II KS+ vector from Stratagene and additionally cloning the PCR
generated 5' end of the gene as a XbaI-BamHI fragment in front of the 3' end.

Gene 2 was cloned as a HindIII blunt ended fragment into the EcoRV site of
pBluescript II SK+ vector from Stratagene. A part of the 3' untranslated sequence
was removed by a SacI digestion, followed by religation. HindIII and HpaI
restriction sites were introduced immediately in front of the start ATG by digestion
with HindIII and NarI and religation in the presence of the following annealed
oligonucleotides

AGCTTGTTAAC ATG TATCCAACCCTCACCTTCGTGG

    ACAATTGTAC ATAGGTTGGGAGTGGAAGCACCGC
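
The following sketch, given for illustration only, checks that the two oligonucleotides anneal as printed (upper strand written 5' to 3', lower strand 3' to 5') and reports the single-stranded overhangs of the duplex. The assumption that the upper strand carries a four-base 5' overhang is inferred from the HindIII/NarI cloning described above; the names `TOP`, `BOTTOM_3_TO_5` and `duplex_overhangs` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TOP = "AGCTTGTTAACATGTATCCAACCCTCACCTTCGTGG"          # 5' -> 3'
BOTTOM_3_TO_5 = "ACAATTGTACATAGGTTGGGAGTGGAAGCACCGC"  # 3' -> 5', as printed


def duplex_overhangs(top, bottom_3to5, top_overhang_len):
    """Check base pairing and return (top 5' overhang, bottom 5' overhang).

    Both overhangs are returned written 5' -> 3'.
    """
    paired_top = top[top_overhang_len:]
    paired_bottom = bottom_3to5[:len(paired_top)]
    if paired_top.translate(COMPLEMENT) != paired_bottom:
        raise ValueError("strands are not complementary in the paired region")
    # Bases of the lower strand beyond the end of the upper strand,
    # reversed so that they read 5' -> 3'.
    bottom_overhang = bottom_3to5[len(paired_top):][::-1]
    return top[:top_overhang_len], bottom_overhang


hind_end, nar_end = duplex_overhangs(TOP, BOTTOM_3_TO_5, top_overhang_len=4)
print("HindIII-compatible overhang:", hind_end)
print("NarI-compatible overhang:", nar_end)
print("HpaI site (GTTAAC) present:", "GTTAAC" in TOP)
```

Under these assumptions the duplex carries an AGCT overhang at one end and a CG overhang at the other, with the HpaI site lying directly in front of the ATG.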

No introns were found in the clones sequenced. 
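
Since no introns were found, the coding sequence can be translated directly. As a minimal sketch, the first 60 bases of SEQ. I.D. No. 3 are translated below with the standard genetic code and compared with the first 20 residues of SEQ. I.D. No. 1 (one-letter code, transcription from the sequence listing assumed); the names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; an incomplete final codon is dropped."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


SEQ3_FIRST_60 = "ATGTTTTCAACCCTTGCGTTTGTCGCACCTAGTGCGCTGGGAGCCAGTACCTTCGTAGGG"
SEQ1_FIRST_20 = "MFSTLAFVAPSALGASTFVG"

protein = translate(SEQ3_FIRST_60)
print(protein, protein == SEQ1_FIRST_20)
```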

The clone 1 type (SEQ. I.D. No. 3) can be aligned with all ten peptide sequences (see
Figure 8) showing 100% identity. Alignment of the two protein sequences encoded
by the genes isolated from the fungal infected algae Gracilariopsis lemaneiformis
shows about 78% identity, indicating that both genes are coding for an α-1,4-glucan
lyase.

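
To illustrate how a percent identity of this kind is computed, the sketch below compares only the first 16 residues of SEQ. I.D. No. 1 and SEQ. I.D. No. 2 position by position, without gaps. It is not the alignment used to obtain the figure quoted in the text, which covers the full sequences; the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions in two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# First 16 residues of each protein (one-letter code, from the sequence listing).
SEQ1_FIRST_16 = "MFSTLAFVAPSALGAS"
SEQ2_FIRST_16 = "MYPTLTFVAPSALGAR"

print(f"{percent_identity(SEQ1_FIRST_16, SEQ2_FIRST_16):.1f}% identity over 16 residues")
```

A full comparison would require a gapped alignment, since the two proteins differ in length (1088 and 1091 amino acids).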

5. EXPRESSION OF THE GL GENE IN MICRO-ORGANISMS 

(E.G. ANALYSES OF PICHIA LYASE TRANSFORMANTS AND 

ASPERGILLUS LYASE TRANSFORMANTS) 

The DNA sequence encoding the GL was introduced into microorganisms to produce
an enzyme with high specific activity and in large quantities.

In this regard, gene 1 (i.e. SEQ. I.D. No. 3) was cloned as a NotI-HindIII blunt
ended (using the DNA blunting kit from Amersham International) fragment into the
Pichia expression vector pHIL-D2 (containing the AOX1 promoter) digested with
EcoRI and blunt ended (using the DNA blunting kit from Amersham International)
for expression in Pichia pastoris (according to the protocol stated in the Pichia
Expression Kit supplied by Invitrogen).

In another embodiment, the gene 1 (i.e. SEQ. I.D. No. 3) was cloned as a
NotI-HindIII blunt ended fragment (using the DNA blunting kit from Amersham
International) into the Aspergillus expression vector pBARMTE1 (containing the
methyl tryptophan resistance promoter from Neurospora crassa) digested with SmaI
for expression in Aspergillus niger (Pall et al (1993) Fungal Genet Newslett. vol 40
pages 59-62). The protoplasts were prepared according to Daboussi et al (Curr Genet
(1989) vol 15 pp 453-456) using lysing enzymes Sigma L-2773 and the lyticase Sigma
L-8012. The transformation of the protoplasts was performed according to the protocol
stated by Buxton et al (Gene (1985) vol 37 pp 207-214) except that for plating the
transformed protoplasts the protocol laid out in Punt et al (Methods in Enzymology
(1992) vol 216 pp 447-457) was followed but with the use of 0.6% osmotic
stabilised top agarose.

The results showed that lyase activity was observed in the transformed Pichia pastoris 
and Aspergillus niger. 

5A GENERAL METHODS

Preparation of cell-free extracts. 

The cells were harvested by centrifugation at 9000 rpm for 5 min and washed with
0.9% NaCl and resuspended in the breaking buffer (50 mM K-phosphate, pH 7.5
containing 1 mM of EDTA, and 5% glycerol). Cells were broken using glass beads
and vortex treatment. The breaking buffer contained 1 mM PMSF (protease
inhibitor). The lyase extract (supernatant) was obtained after centrifugation at 9000 rpm
for 5 min followed by centrifugation at 20,000 xg for 5 min.

Assay of lyase activity by alkaline 3,5-dinitrosalicylic acid reagent (DNS) 

One volume of lyase extract was mixed with an equal volume of 4% amylopectin
solution. The reaction mixture was then incubated at a controlled temperature and
samples were removed at specified intervals and analyzed for AF.

The lyase activity was also analyzed using a radioactive method. 

The reaction mixture contained 10 µl 14C-starch solution (1 µCi; Sigma Chemicals
Co.) and 10 µl of the lyase extract. The reaction mixture was left at 25°C overnight
and was then analyzed in the usual TLC system. The radioactive AF produced was
detected using an Instant Imager (Packard Instrument Co., Inc., Meriden, CT).

Electrophoresis and Western blotting 

SDS-PAGE was performed using 8-25% gradient gels and the PhastSystem
(Pharmacia). Western blotting was also run on a semidry transfer unit of the
PhastSystem.

Primary antibodies raised against the lyase purified from the red seaweed collected
at Qingdao (China) were used in a dilution of 1:100. Pig antirabbit IgG conjugated
to alkaline phosphatase (Dako A/S, Glostrup, Denmark) was used as secondary
antibody in a dilution of 1:1000.

Part I, Analysis of the Pichia transformants containing the above mentioned
construct

Results:

1. Lyase activity was determined 5 days after induction (according to the manual) and
proved the activity to be intracellular for all samples in the B series.

Samples of B series:   11   12   13   15   26   27   28   29   30

Specific activity*:   139   81  122  192  151  253  199  198  150

*Specific activity is defined as nmol AF released per min per mg protein in a reaction
mixture containing 2% (w/v) of glycogen, 1% (w/v) glycerol in 10 mM potassium
phosphate buffer (pH 7.5). The reaction temperature was 45°C; the reaction time was
60 min.
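
By way of example only, the definition of specific activity and the B series values reported above can be expressed as follows. The function `specific_activity` simply restates the definition (nmol AF per min per mg protein); the input figures in its usage line are invented for illustration, and the names `B_SERIES` and `summarise` are hypothetical.

```python
from statistics import mean

# Specific activities of the B series transformants as reported in the text.
B_SERIES = {11: 139, 12: 81, 13: 122, 15: 192, 26: 151,
            27: 253, 28: 199, 29: 198, 30: 150}


def specific_activity(nmol_af, minutes, mg_protein):
    """nmol AF released per min per mg protein."""
    return nmol_af / (minutes * mg_protein)


def summarise(activities):
    """Return (mean activity, sample with the highest activity)."""
    best = max(activities, key=activities.get)
    return mean(activities.values()), best


average, best_sample = summarise(B_SERIES)
print(f"mean = {average}, highest = B{best_sample} ({B_SERIES[best_sample]})")
# Invented example: 3000 nmol AF released in 60 min by 0.5 mg protein.
print(specific_activity(3000, 60, 0.5))
```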

A time course of sample B27 is as follows. The data are also presented in Figure 1.

Time (min)    0   10   20   30   40   50   60

Spec. act.    0   18   54   90  147  179  253

Assay conditions were as above except that the time was varied.

2. Western-blotting analysis.

The CFE of all samples showed bands with a molecular weight corresponding to the
native lyase.

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MC-Lyase expressed intracellularly in Pichia pastoris 



Names of culture    Specific activity*

A18                 10
A20                 32
A21                 8
A22                 8
A24                 6


Part II, The Aspergillus transformants

Results

I. Lyase activity was determined after 5 days incubation (minimal medium
containing 0.2% casein enzymatic hydrolysate), analysis by the alkaline
3,5-dinitrosalicylic acid reagent

1). Lyase activity analysis of the culture medium

Among 35 cultures grown with 0.2% amylopectin included in the culture medium,
AF was only detectable in two cultures. The culture medium of 5.4+ and 5.9+
contained 0.13 g AF/liter and 0.44 g/liter, respectively. The result indicated that
active lyase had been secreted from the cells. Lyase activity was also measurable in
the cell-free extract.
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2). Lyase activity analysis in cell-free extracts 



Name of the culture    Specific activity*

5.4+                   51
5.9+                   148
5.13                   99
5.15                   25
5.19                   37

*The specific activity was defined as nmol of AF produced per min per mg protein 
at 25°C. + indicates that 0.2% amylopectin was added. 

The results show that Gene 1 of GL was expressed intracellularly in A. niger.

Experiments with transformed E. coli (using cloning vectors pQE30 from the
QIAexpress vector kit from Qiagen) showed expression of enzyme that was recognised
by antibody to the enzyme purified from fungally infected Gracilariopsis
lemaneiformis.

Instead of Aspergillus niger as host, other industrially important microorganisms for
which good expression systems are known could be used such as: Aspergillus oryzae,
Aspergillus sp., Trichoderma sp., Saccharomyces cerevisiae, Kluyveromyces sp.,
Hansenula sp., Pichia sp., Bacillus subtilis, B. amyloliquefaciens, Bacillus sp.,
Streptomyces sp. or E. coli.
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Other preferred embodiments of the present invention include any one of the 
following: A transformed host organism having the capability of producing AF as
a consequence of the introduction of a DNA sequence as herein described; such a
transformed host organism which is a microorganism - preferably wherein the host
organism is selected from the group consisting of bacteria, moulds, fungi and yeast;
preferably the host organism is selected from the group consisting of Saccharomyces,
Kluyveromyces, Aspergillus, Trichoderma, Hansenula, Pichia, Bacillus, Streptomyces,
Escherichia such as Aspergillus oryzae, Saccharomyces cerevisiae, Bacillus subtilis,
Bacillus amyloliquefaciens, Escherichia coli; A method for preparing the sugar
1,5-D-anhydrofructose comprising contacting an alpha 1,4-glucan (e.g. starch) with the
enzyme α-1,4-glucan lyase expressed by a transformed host organism comprising a
nucleotide sequence encoding the same, preferably wherein the nucleotide sequence
is a DNA sequence, preferably wherein the DNA sequence is one of the sequences
hereinbefore described; A vector incorporating a nucleotide sequence as hereinbefore
described, preferably wherein the vector is a replication vector, preferably wherein
the vector is an expression vector containing the nucleotide sequence downstream
from a promoter sequence, the vector preferably containing a marker (such as a
resistance marker); Cellular organisms, or cell line, transformed with such a vector;
A method of producing the product α-1,4-glucan lyase or any nucleotide sequence or
part thereof coding for same, which comprises culturing such an organism (or cells
from a cell line) transfected with such a vector and recovering the product.

Other modifications of the present invention will be apparent to those skilled in the 
art without departing from the scope of the invention. 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: DANISCO A/S 

(B) STREET: LANGEBROGADE 1 

(C) CITY: COPENHAGEN 

(D) STATE: COPENHAGEN K 

(E) COUNTRY: DENMARK 

(F) POSTAL CODE (ZIP): DK-1001 

(ii) TITLE OF INVENTION: ENZYME 
(iii) NUMBER OF SEQUENCES: 20 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(v) CURRENT APPLICATION DATA: 

APPLICATION NUMBER: WO PCT/EP94/03399 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1088 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Met Phe Ser Thr Leu Ala Phe Val Ala Pro Ser Ala Leu Gly Ala Ser
1 5 10 15

Thr Phe Val Gly Ala Glu Val Arg Ser Asn Val Arg Ile His Ser Ala
20 25 30

Phe Pro Ala Val His Thr Ala Thr Arg Lys Thr Asn Arg Leu Asn Val
35 40 45

Ser Met Thr Ala Leu Ser Asp Lys Gln Thr Ala Thr Ala Gly Ser Thr
50 55 60

Asp Asn Pro Asp Gly Ile Asp Tyr Lys Thr Tyr Asp Tyr Val Gly Val
65 70 75 80

Trp Gly Phe Ser Pro Leu Ser Asn Thr Asn Trp Phe Ala Ala Gly Ser
85 90 95

Ser Thr Pro Gly Gly Ile Thr Asp Trp Thr Ala Thr Met Asn Val Asn
100 105 110

Phe Asp Arg Ile Asp Asn Pro Ser Ile Thr Val Gln His Pro Val Gln
115 120 125

Val Gln Val Thr Ser Tyr Asn Asn Asn Ser Tyr Arg Val Arg Phe Asn
130 135 140

Pro Asp Gly Pro Ile Arg Asp Val Thr Arg Gly Pro Ile Leu Lys Gln
145 150 155 160

Gln Leu Asp Trp Ile Arg Thr Gln Glu Leu Ser Glu Gly Cys Asp Pro
165 170 175

Gly Met Thr Phe Thr Ser Glu Gly Phe Leu Thr Phe Glu Thr Lys Asp
180 185 190

Leu Ser Val Ile Ile Tyr Gly Asn Phe Lys Thr Arg Val Thr Arg Lys
195 200 205

Ser Asp Gly Lys Val Ile Met Glu Asn Asp Glu Val Gly Thr Ala Ser
210 215 220

Ser Gly Asn Lys Cys Arg Gly Leu Met Phe Val Asp Arg Leu Tyr Gly
225 230 235 240

Asn Ala Ile Ala Ser Val Asn Lys Asn Phe Arg Asn Asp Ala Val Lys
245 250 255

Gln Glu Gly Phe Tyr Gly Ala Gly Glu Val Asn Cys Lys Tyr Gln Asp
260 265 270

Thr Tyr Ile Leu Glu Arg Thr Gly Ile Ala Met Thr Asn Tyr Asn Tyr
275 280 285

Asp Asn Leu Asn Tyr Asn Gln Trp Asp Leu Arg Pro Pro His His Asp
290 295 300

Gly Ala Leu Asn Pro Asp Tyr Tyr Ile Pro Met Tyr Tyr Ala Ala Pro
305 310 315 320

Trp Leu Ile Val Asn Gly Cys Ala Gly Thr Ser Glu Gln Tyr Ser Tyr
325 330 335

Gly Trp Phe Met Asp Asn Val Ser Gln Ser Tyr Met Asn Thr Gly Asp
340 345 350

Thr Thr Trp Asn Ser Gly Gln Glu Asp Leu Ala Tyr Met Gly Ala Gln
355 360 365

Tyr Gly Pro Phe Asp Gln His Phe Val Tyr Gly Ala Gly Gly Gly Met
370 375 380

Glu Cys Val Val Thr Ala Phe Ser Leu Leu Gln Gly Lys Glu Phe Glu
385 390 395 400

Asn Gln Val Leu Asn Lys Arg Ser Val Met Pro Pro Lys Tyr Val Phe
405 410 415

Gly Phe Phe Gln Gly Val Phe Gly Thr Ser Ser Leu Leu Arg Ala His
420 425 430

Met Pro Ala Gly Glu Asn Asn Ile Ser Val Glu Glu Ile Val Glu Gly
435 440 445

Tyr Gln Asn Asn Asn Phe Pro Phe Glu Gly Leu Ala Val Asp Val Asp
450 455 460

Met Gln Asp Asn Leu Arg Val Phe Thr Thr Lys Gly Glu Phe Trp Thr
465 470 475 480

Ala Asn Arg Val Gly Thr Gly Gly Asp Pro Asn Asn Arg Ser Val Phe
485 490 495

Glu Trp Ala His Asp Lys Gly Leu Val Cys Gln Thr Asn Ile Thr Cys
500 505 510

Phe Leu Arg Asn Asp Asn Glu Gly Gln Asp Tyr Glu Val Asn Gln Thr
515 520 525

Leu Arg Glu Arg Gln Leu Tyr Thr Lys Asn Asp Ser Leu Thr Gly Thr
530 535 540

Asp Phe Gly Met Thr Asp Asp Gly Pro Ser Asp Ala Tyr Ile Gly His
545 550 555 560

Leu Asp Tyr Gly Gly Gly Val Glu Cys Asp Ala Leu Phe Pro Asp Trp
565 570 575

Gly Arg Pro Asp Val Ala Glu Trp Trp Gly Asn Asn Tyr Lys Lys Leu
580 585 590

Phe Ser Ile Gly Leu Asp Phe Val Trp Gln Asp Met Thr Val Pro Ala
595 600 605

Met Met Pro His Lys Ile Gly Asp Asp Ile Asn Val Lys Pro Asp Gly
610 615 620

Asn Trp Pro Asn Ala Asp Asp Pro Ser Asn Gly Gln Tyr Asn Trp Lys
625 630 635 640

Thr Tyr His Pro Gln Val Leu Val Thr Asp Met Arg Tyr Glu Asn His
645 650 655

Gly Arg Glu Pro Met Val Thr Gln Arg Asn Ile His Ala Tyr Thr Leu
660 665 670

Cys Glu Ser Thr Arg Lys Glu Gly Ile Val Glu Asn Ala Asp Thr Leu
675 680 685

Thr Lys Phe Arg Arg Ser Tyr Ile Ile Ser Arg Gly Gly Tyr Ile Gly
690 695 700

Asn Gln His Phe Gly Gly Met Trp Val Gly Asp Asn Ser Thr Thr Ser
705 710 715 720

Asn Tyr Ile Gln Met Met Ile Ala Asn Asn Ile Asn Met Asn Met Ser
725 730 735

Cys Leu Pro Leu Val Gly Ser Asp Ile Gly Gly Phe Thr Ser Tyr Asp
740 745 750

Asn Glu Asn Gln Arg Thr Pro Cys Thr Gly Asp Leu Met Val Arg Tyr
755 760 765

Val Gln Ala Gly Cys Leu Leu Pro Trp Phe Arg Asn His Tyr Asp Arg
770 775 780

Trp Ile Glu Ser Lys Asp His Gly Lys Asp Tyr Gln Glu Leu Tyr Met
785 790 795 800

Tyr Pro Asn Glu Met Asp Thr Leu Arg Lys Phe Val Glu Phe Arg Tyr
805 810 815

Arg Trp Gln Glu Val Leu Tyr Thr Ala Met Tyr Gln Asn Ala Ala Phe
820 825 830

Gly Lys Pro Ile Ile Lys Ala Ala Ser Met Tyr Asn Asn Asp Ser Asn
835 840 845

Val Arg Arg Ala Gln Asn Asp His Phe Leu Leu Gly Gly His Asp Gly
850 855 860

Tyr Arg Ile Leu Cys Ala Pro Val Val Trp Glu Asn Ser Thr Glu Arg
865 870 875 880

Glu Leu Tyr Leu Pro Val Leu Thr Gln Trp Tyr Lys Phe Gly Pro Asp
885 890 895

Phe Asp Thr Lys Pro Leu Glu Gly Ala Met Asn Gly Gly Asp Arg Ile
900 905 910

Tyr Asn Tyr Pro Val Pro Gln Ser Glu Ser Pro Ile Phe Val Arg Glu
915 920 925

Gly Ala Ile Leu Pro Thr Arg Tyr Thr Leu Asn Gly Glu Asn Lys Ser
930 935 940

Leu Asn Thr Tyr Thr Asp Glu Asp Pro Leu Val Phe Glu Val Phe Pro
945 950 955 960

Leu Gly Asn Asn Arg Ala Asp Gly Met Cys Tyr Leu Asp Asp Gly Gly
965 970 975

Val Thr Thr Asn Ala Glu Asp Asn Gly Lys Phe Ser Val Val Lys Val
980 985 990

Ala Ala Glu Gln Asp Gly Gly Thr Glu Thr Ile Thr Phe Thr Asn Asp
995 1000 1005

Cys Tyr Glu Tyr Val Phe Gly Gly Pro Phe Tyr Val Arg Val Arg Gly
1010 1015 1020

Ala Gln Ser Pro Ser Asn Ile His Val Ser Ser Gly Ala Gly Ser Gln
1025 1030 1035 1040

Asp Met Lys Val Ser Ser Ala Thr Ser Arg Ala Ala Leu Phe Asn Asp
1045 1050 1055

Gly Glu Asn Gly Asp Phe Trp Val Asp Gln Glu Thr Asp Ser Leu Trp
1060 1065 1070

Leu Lys Leu Pro Asn Val Val Leu Pro Asp Ala Val Ile Thr Ile Thr
1075 1080 1085

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1091 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Tyr Pro Thr Leu Thr Phe Val Ala Pro Ser Ala Leu Gly Ala Arg
1 5 10 15

Thr Phe Thr Cys Val Gly Ile Phe Arg Ser His Ile Leu Ile His Ser
20 25 30

Val Val Pro Ala Val Arg Leu Ala Val Arg Lys Ser Asn Arg Leu Asn
35 40 45

Val Ser Met Ser Ala Leu Phe Asp Lys Pro Thr Ala Val Thr Gly Gly
50 55 60

Lys Asp Asn Pro Asp Asn Ile Asn Tyr Thr Thr Tyr Asp Tyr Val Pro
65 70 75 80

Val Trp Arg Phe Asp Pro Leu Ser Asn Thr Asn Trp Phe Ala Ala Gly
85 90 95

Ser Ser Thr Pro Gly Asp Ile Asp Asp Trp Thr Ala Thr Met Asn Val
100 105 110

Asn Phe Asp Arg Ile Asp Asn Pro Ser Phe Thr Leu Glu Lys Pro Val
115 120 125

Gln Val Gln Val Thr Ser Tyr Lys Asn Asn Cys Phe Arg Val Arg Phe
130 135 140

Asn Pro Asp Gly Pro Ile Arg Asp Val Asp Arg Gly Pro Ile Leu Gln
145 150 155 160

Gln Gln Leu Asn Trp Ile Arg Lys Gln Glu Gln Ser Lys Gly Phe Asp
165 170 175

Pro Lys Met Gly Phe Thr Lys Glu Gly Phe Leu Lys Phe Glu Thr Lys
180 185 190

Asp Leu Asn Val Ile Ile Tyr Gly Asn Phe Lys Thr Arg Val Thr Arg
195 200 205

Lys Arg Asp Gly Lys Gly Ile Met Glu Asn Asn Glu Val Pro Ala Gly
210 215 220

Ser Leu Gly Asn Lys Cys Arg Gly Leu Met Phe Val Asp Arg Leu Tyr
225 230 235 240

Gly Thr Ala Ile Ala Ser Val Asn Glu Asn Tyr Arg Asn Asp Pro Asp
245 250 255

Arg Lys Glu Gly Phe Tyr Gly Ala Gly Glu Val Asn Cys Glu Phe Trp
260 265 270

Asp Ser Glu Gln Asn Arg Asn Lys Tyr Ile Leu Glu Arg Thr Gly Ile
275 280 285

Ala Met Thr Asn Tyr Asn Tyr Asp Asn Tyr Asn Tyr Asn Gln Ser Asp
290 295 300

Leu Ile Ala Pro Gly Tyr Pro Ser Asp Pro Asn Phe Tyr Ile Pro Met
305 310 315 320

Tyr Phe Ala Ala Pro Trp Val Val Val Lys Gly Cys Ser Gly Asn Ser
325 330 335

Asp Glu Gln Tyr Ser Tyr Gly Trp Phe Met Asp Asn Val Ser Gln Thr
340 345 350

Tyr Met Asn Thr Gly Gly Thr Ser Trp Asn Cys Gly Glu Glu Asn Leu
355 360 365

Ala Tyr Met Gly Ala Gln Cys Gly Pro Phe Asp Gln His Phe Val Tyr
370 375 380

Gly Asp Gly Asp Gly Leu Glu Asp Val Val Gln Ala Phe Ser Leu Leu
385 390 395 400

Gln Gly Lys Glu Phe Glu Asn Gln Val Leu Asn Lys Arg Ala Val Met
405 410 415

Pro Pro Lys Tyr Val Phe Gly Tyr Phe Gln Gly Val Phe Gly Ile Ala
420 425 430

Ser Leu Leu Arg Glu Gln Arg Pro Glu Gly Gly Asn Asn Ile Ser Val
435 440 445

Gln Glu Ile Val Glu Gly Tyr Gln Ser Asn Asn Phe Pro Leu Glu Gly
450 455 460

Leu Ala Val Asp Val Asp Met Gln Gln Asp Leu Arg Val Phe Thr Thr
465 470 475 480

Lys Ile Glu Phe Trp Thr Ala Asn Lys Val Gly Thr Gly Gly Asp Ser
485 490 495

Asn Asn Lys Ser Val Phe Glu Trp Ala His Asp Lys Gly Leu Val Cys
500 505 510

Gln Thr Asn Val Thr Cys Phe Leu Arg Asn Asp Asn Gly Gly Ala Asp
515 520 525

Tyr Glu Val Asn Gln Thr Leu Arg Glu Lys Gly Leu Tyr Thr Lys Asn
530 535 540

Asp Ser Leu Thr Asn Thr Asn Phe Gly Thr Thr Asn Asp Gly Pro Ser
545 550 555 560

Asp Ala Tyr Ile Gly His Leu Asp Tyr Gly Gly Gly Gly Asn Cys Asp
565 570 575

Ala Leu Phe Pro Asp Trp Gly Arg Pro Gly Val Ala Glu Trp Trp Gly
580 585 590

Asp Asn Tyr Ser Lys Leu Phe Lys Ile Gly Leu Asp Phe Val Trp Gln
595 600 605

Asp Met Thr Val Pro Ala Met Met Pro His Lys Val Gly Asp Ala Val
610 615 620

Asp Thr Arg Ser Pro Tyr Gly Trp Pro Asn Glu Asn Asp Pro Ser Asn
625 630 635 640

Gly Arg Tyr Asn Trp Lys Ser Tyr His Pro Gln Val Leu Val Thr Asp
645 650 655

Met Arg Tyr Glu Asn His Gly Arg Glu Pro Met Phe Thr Gln Arg Asn
660 665 670

Met His Ala Tyr Thr Leu Cys Glu Ser Thr Arg Lys Glu Gly Ile Val
675 680 685

Ala Asn Ala Asp Thr Leu Thr Lys Phe Arg Arg Ser Tyr Ile Ile Ser
690 695 700

Arg Gly Gly Tyr Ile Gly Asn Gln His Phe Gly Gly Met Trp Val Gly
705 710 715 720

Asp Asn Ser Ser Ser Gln Arg Tyr Leu Gln Met Met Ile Ala Asn Ile
725 730 735

Val Asn Met Asn Met Ser Cys Leu Pro Leu Val Gly Ser Asp Ile Gly
740 745 750

Gly Phe Thr Ser Tyr Asp Gly Arg Asn Val Cys Pro Gly Asp Leu Met
755 760 765

Val Arg Phe Val Gln Ala Gly Cys Leu Leu Pro Trp Phe Arg Asn His
770 775 780

Tyr Gly Arg Leu Val Glu Gly Lys Gln Glu Gly Lys Tyr Tyr Gln Glu
785 790 795 800

Leu Tyr Met Tyr Lys Asp Glu Met Ala Thr Leu Arg Lys Phe Ile Glu
805 810 815

Phe Arg Tyr Arg Trp Gln Glu Val Leu Tyr Thr Ala Met Tyr Gln Asn
820 825 830

Ala Ala Phe Gly Lys Pro Ile Ile Lys Ala Ala Ser Met Tyr Asp Asn
835 840 845

Asp Arg Asn Val Arg Gly Ala Gln Asp Asp His Phe Leu Leu Gly Gly
850 855 860

His Asp Gly Tyr Arg Ile Leu Cys Ala Pro Val Val Trp Glu Asn Thr
865 870 875 880

Thr Ser Arg Asp Leu Tyr Leu Pro Val Leu Thr Lys Trp Tyr Lys Phe
885 890 895

Gly Pro Asp Tyr Asp Thr Lys Arg Leu Asp Ser Ala Leu Asp Gly Gly
900 905 910

Gln Met Ile Lys Asn Tyr Ser Val Pro Gln Ser Asp Ser Pro Ile Phe
915 920 925

Val Arg Glu Gly Ala Ile Leu Pro Thr Arg Tyr Thr Leu Asp Gly Ser
930 935 940

Asn Lys Ser Met Asn Thr Tyr Thr Asp Lys Asp Pro Leu Val Phe Glu
945 950 955 960

Val Phe Pro Leu Gly Asn Asn Arg Ala Asp Gly Met Cys Tyr Leu Asp
965 970 975

Asp Gly Gly Ile Thr Thr Asp Ala Glu Asp His Gly Lys Phe Ser Val
980 985 990

Ile Asn Val Glu Ala Leu Arg Lys Gly Val Thr Thr Thr Ile Lys Phe
995 1000 1005

Ala Tyr Asp Thr Tyr Gln Tyr Val Phe Asp Gly Pro Phe Tyr Val Arg
1010 1015 1020

Ile Arg Asn Leu Thr Thr Ala Ser Lys Ile Asn Val Ser Ser Gly Ala
1025 1030 1035 1040

Gly Glu Glu Asp Met Thr Pro Thr Ser Ala Asn Ser Arg Ala Ala Leu
1045 1050 1055

Phe Ser Asp Gly Gly Val Gly Glu Tyr Trp Ala Asp Asn Asp Thr Ser
1060 1065 1070

Ser Leu Trp Met Lys Leu Pro Asn Leu Val Leu Gln Asp Ala Val Ile
1075 1080 1085

Thr Ile Thr
1090

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



ATGTTTTCAA CCCTTGCGTT TGTCGCACCT AGTGCGCTGG GAGCCAGTAC CTTCGTAGGG 60

GCGGAGGTCA GGTCAAATGT TCGTATCCAT TCCGCTTTTC CAGCTGTGCA CACAGCTACT 120

CGCAAAACCA ATCGCCTCAA TGTATCCATG ACCGCATTGT CCGACAAACA AACGGCTACT 180

GCGGGTAGTA CAGACAATCC GGACGGTATC GACTACAAGA CCTACGATTA CGTCGGAGTA 240

TGGGGTTTCA GCCCCCTCTC CAACACGAAC TGGTTTGCTG CCGGCTCTTC TACCCCGGGT 300

GGCATCACTG ATTGGACGGC TACAATGAAT GTCAACTTCG ACCGTATCGA CAATCCGTCC 360

ATCACTGTCC AGCATCCCGT TCAGGTTCAG GTCACGTCAT ACAACAACAA CAGCTACAGG 420

GTTCGCTTCA ACCCTGATGG CCCTATTCGT GATGTGACTC GTGGGCCTAT CCTCAAGCAG 480

CAACTAGATT GGATTCGAAC GCAGGAGCTG TCAGAGGGAT GTGATCCCGG AATGACTTTC 540

ACATCAGAAG GTTTCTTGAC TTTTGAGACC AAGGATCTAA GCGTCATCAT CTACGGAAAT 600

TTCAAGACCA GAGTTACGAG AAAGTCTGAC GGCAAGGTCA TCATGGAAAA TGATGAAGTT 660

GGAACTGCAT CGTCCGGGAA CAAGTGCCGG GGATTGATGT TCGTTGATAG ATTATACGGT 720

AACGCTATCG CTTCCGTCAA CAAGAACTTC CGCAACGACG CGGTCAAGCA GGAGGGATTC 780

TATGGTGCAG GTGAAGTCAA CTGTAAGTAC CAGGACACCT ACATCTTAGA ACGCACTGGA 840

ATCGCCATGA CAAATTACAA CTACGATAAC TTGAACTATA ACCAGTGGGA CCTTAGACCT 900

CCGCATCATG ATGGTGCCCT CAACCCAGAC TATTATATTC CAATGTACTA CGCAGCACCT 960

TGGTTGATCG TTAATGGATG CGCCGGTACT TCGGAGCAGT ACTCGTATGG ATGGTTCATG 1020

GACAATGTCT CTCAATCTTA CATGAATACT GGAGATACTA CCTGGAATTC TGGACAAGAG 1080

GACCTGGCAT ACATGGGCGC GCAGTATGGA CCATTTGACC AACATTTTGT TTACGGTGCT 1140

GGGGGTGGGA TGGAATGTGT GGTCACAGCG TTCTCTCTTC TACAAGGCAA GGAGTTCGAG 1200

AACCAAGTTC TCAACAAACG TTCAGTAATG CCTCCGAAAT ACGTCTTTGG TTTCTTCCAG 1260

GGTGTTTTCG GGACTTCTTC CTTGTTGAGA GCGCATATGC CAGCAGGTGA GAACAACATC 1320

TCAGTCGAAG AAATTGTAGA AGGTTATCAA AACAACAATT TCCCTTTCGA GGGGCTCGCT 1380

GTGGACGTGG ATATGCAAGA CAACTTGCGG GTGTTCACCA CGAAGGGCGA ATTTTGGACC 1440

GCAAACAGGG TGGGTACTGG CGGGGATCCA AACAACCGAT CGGTTTTTGA ATGGGCACAT 1500

GACAAAGGCC TTGTTTGTCA GACAAATATA ACTTGCTTCC TGAGGAATGA TAACGAGGGG 1560

CAAGACTACG AGGTCAATCA GACGTTAAGG GAGAGGCAGT TGTACACGAA GAACGACTCC 1620

CTGACGGGTA CGGATTTTGG AATGACCGAC GACGGCCCCA GCGATGCGTA CATCGGTCAT 1680

CTGGACTATG GGGGTGGAGT AGAATGTGAT GCACTTTTCC CAGACTGGGG ACGGCCTGAC 1740

GTGGCCGAAT GGTGGGGAAA TAACTATAAG AAACTGTTCA GCATTGGTCT CGACTTCGTC 1800

TGGCAAGACA TGACTGTTCC AGCAATGATG CCGCACAAAA TTGGCGATGA CATCAATGTG 1860

AAACCGGATG GGAATTGGCC GAATGCGGAC GATCCGTCCA ATGGACAATA CAACTGGAAG 1920

ACGTACCATC CCCAAGTGCT TGTAACTGAT ATGCGTTATG AGAATCATGG TCGGGAACCG 1980

ATGGTCACTC AACGCAACAT TCATGCGTAT ACACTGTGCG AGTCTACTAG GAAGGAAGGG 2040

ATCGTGGAAA ACGCAGACAC TCTAACGAAG TTCCGCCGTA GCTACATTAT CAGTCGTGGT 2100

GGTTACATTG GTAACCAGCA TTTCGGGGGT ATGTGGGTGG GAGACAACTC TACTACATCA 2160

AACTACATCC AAATGATGAT TGCCAACAAT ATTAACATGA ATATGTCTTG CTTGCCTCTC 2220

GTCGGCTCCG ACATTGGAGG ATTCACCTCA TACGACAATG AGAATCAGCG AACGCCGTGT 2280

ACCGGGGACT TGATGGTGAG GTATGTGCAG GCGGGCTGCC TGTTGCCGTG GTTCAGGAAC 2340

CACTATGATA GGTGGATCGA GTCCAAGGAC CACGGAAAGG ACTACCAGGA GCTGTACATG 2400

TATCCGAATG AAATGGATAC GTTGAGGAAG TTCGTTGAAT TCCGTTATCG CTGGCAGGAA 2460

GTGTTGTACA CGGCCATGTA CCAGAATGCG GCTTTCGGAA AGCCGATTAT CAAGGCTGCT 2520

TCGATGTACA ATAACGACTC AAACGTTCGC AGGGCGCAGA ACGATCATTT CCTTCTTGGT 2580

GGACATGATG GATATCGCAT TCTGTGCGCG CCTGTTGTGT GGGAGAATTC GACCGAACGC 2640

GAATTGTACT TGCCCGTGCT GACCCAATGG TACAAATTCG GTCCCGACTT TGACACCAAG 2700

CCTCTGGAAG GAGCGATGAA CGGAGGGGAC CGAATTTACA ACTACCCTGT ACCGCAAAGT 2760

GAATCACCAA TCTTCGTGAG AGAAGGTGCG ATTCTCCCTA CCCGCTACAC GTTGAACGGT 2820

GAAAACAAAT CATTGAACAC GTACACGGAC GAAGATCCGT TGGTGTTTGA AGTATTCCCC 2880

CTCGGAAACA ACCGTGCCGA CGGTATGTGT TATCTTGATG ATGGCGGTGT GACCACCAAT 2940

GCTGAAGACA ATGGCAAGTT CTCTGTCGTC AAGGTGGCAG CGGAGCAGGA TGGTGGTACG 3000

GAGACGATAA CGTTTACGAA TGATTGCTAT GAGTACGTTT TCGGTGGACC GTTCTACGTT 3060

CGAGTGCGCG GCGCTCAGTC GCCGTCGAAC ATCCACGTGT CTTCTGGAGC GGGTTCTCAG 3120

GACATGAAGG TGAGCTCTGC CACTTCCAGG GCTGCGCTGT TCAATGACGG GGAGAACGGT 3180

GATTTCTGGG TTGACCAGGA GACAGATTCT CTGTGGCTGA AGTTGCCCAA CGTTGTTCTC 3240

CCGGACGCTG TGATCACAAT TACCTAA 3267

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3276 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



ATGTATCCAA CCCTCACCTT CGTGGCGCCT AGTGCGCTAG GGGCCAGAAC TTTCACGTGT 60 

GTGGGCATTT TTAGGTCACA CATTCTTATT CATTCGGTTG TTCCAGCGGT GCGTCTAGCT 120 

GTGCGCAAAA GCAACCGCCT CAATGTATCC ATGTCCGCTT TGTTCGACAA ACCGACTGCT 180 

GTTACTGGAG GGAAGGACAA CCCGGACAAT ATCAATTACA CCACTTATGA CTACGTCCCT 240 

GTGTGGCGCT TCGACCCCCT CAGCAATACG AACTGGTTTG CTGCCGGATC TTCCACTCCC 300 

GGCGATATTG ACGACTGGAC GGCGACAATG AATGTGAACT TCGACCGTAT CGACAATCCA 360 

TCCTTCACTC TCGAGAAACC GGTTCAGGTT CAGGTCACGT CATACAAGAA CAATTGTTTC 420 

AGGGTTCGCT TCAACCCTGA TGGTCCTATT CGCGATGTGG ATCGTGGGCC TATCCTCCAG 480 

CAGCAACTAA ATTGGATCCG GAAGCAGGAG CAGTCGAAGG GGTTTGATCC TAAGATGGGC 540 

TTCACAAAAG AAGGTTTCTT GAAATTTGAG ACCAAGGATC TGAACGTTAT CATATATGGC 600 

AATTTTAAGA CTAGAGTTAC GAGGAAGAGG GATGGAAAAG GGATCATGGA GAATAATGAA 660 

GTGCCGGCAG GATCGTTAGG GAACAAGTGC CGGGGATTGA TGTTTGTCGA CAGGTTGTAC 720 

GGCACTGCCA TCGCTTCCGT TAATGAAAAT TACCGCAACG ATCCCGACAG GAAAGAGGGG 780 

TTCTATGGTG CAGGAGAAGT AAACTGCGAG TTTTGGGACT CCGAACAAAA CAGGAACAAG 840 

TACATCTTAG AACGAACTGG AATCGCCATG ACAAATTACA ATTATGACAA CTATAACTAC 900 

AACCAGTCAG ATCTTATTGC TCCAGGATAT CCTTCCGACC CGAACTTCTA CATTCCCATG 960 

TATTTTGCAG CACCTTGGGT AGTTGTTAAG GGATGCAGTG GCAACAGCGA TGAACAGTAC 1020 

TCGTACGGAT GGTTTATGGA TAATGTCTCC CAAACTTACA TGAATACTGG TGGTACTTCC 1080 

TGGAACTGTG GAGAGGAGAA CTTGGCATAC ATGGGAGCAC AGTGCGGTCC ATTTGACCAA 1140 

CATTTTGTGT ATGGTGATGG AGATGGTCTT GAGGATGTTG TCCAAGCGTT CTCTCTTCTG 1200 

CAAGGCAAAG AGTTTGAGAA CCAAGTTCTG AACAAACGTG CCGTAATGCC TCCGAAATAT 1260 

GTGTTTGGTT ACTTTCAGGG AGTCTTTGGG ATTGCTTCCT TGTTGAGAGA GCAAAGACCA 1320 

GAGGGTGGTA ATAACATCTC TGTTCAAGAG ATTGTCGAAG GTTACCAAAG CAATAACTTC 1380 

CCTTTAGAGG GGTTAGCCGT AGATGTGGAT ATGCAACAAG ATTTGCGCGT GTTCACCACG 1440 

AAGATTGAAT TTTGGACGGC AAATAAGGTA GGCACCGGGG GAGACTCGAA TAACAAGTCG 1500 

GTGTTTGAAT GGGCACATGA CAAAGGCCTT GTATGTCAGA CGAATGTTAC TTGCTTCTTG 1560 

AGAAACGACA ACGGCGGGGC AGATTACGAA GTCAATCAGA CATTGAGGGA GAAGGGTTTG 1620 

TACACGAAGA ATGACTCACT GACGAACACT AACTTCGGAA CTACCAACGA CGGGCCGAGC 1680 

GATGCGTACA TTGGACATCT GGACTATGGT GGCGGAGGGA ATTGTGATGC ACTTTTCCCA 1740 

GACTGGGGTC GACCGGGTGT GGCTGAATGG TGGGGTGATA ACTACAGCAA GCTCTTCAAA 1800 

ATTGGTCTGG ATTTCGTCTG GCAAGACATG ACAGTTCCAG CTATGATGCC ACACAAAGTT 1860 

GGCGACGCAG TCGATACGAG ATCACCTTAC GGCTGGCCGA ATGAGAATGA TCCTTCGAAC 1920 

GGACGATACA ATTGGAAATC TTACCATCCA CAAGTTCTCG TAACTGATAT GCGATATGAG 1980 

AATCATGGAA GGGAACCGAT GTTCACTCAA CGCAATATGC ATGCGTACAC ACTCTGTGAA 2040 

TCTACGAGGA AGGAAGGGAT TGTTGCAAAT GCAGACACTC TAACGAAGTT CCGCCGCAGT 2100 

TATATTATCA GTCGTGGAGG TTACATTGGC AACCAGCATT TTGGAGGAAT GTGGGTTGGA 2160 

GACAACTCTT CCTCCCAAAG ATACCTCCAA ATGATGATCG CGAACATCGT CAACATGAAC 2220 

ATGTCTTGCC TTCCACTAGT TGGGTCCGAC ATTGGAGGTT TTACTTCGTA TGATGGACGA 2280 

AACGTGTGTC CCGGGGATCT AATGGTAAGA TTCGTGCAGG CGGGTTGCTT ACTACCGTGG 2340 

TTCAGAAACC ACTATGGTAG GTTGGTCGAG GGCAAGCAAG AGGGAAAATA CTATCAAGAA 2400 

CTGTACATGT ACAAGGACGA GATGGCTACA TTGAGAAAAT TCATTGAATT CCGTTACCGC 2460 

TGGCAGGAGG TGTTGTACAC TGCTATGTAC CAGAATGCGG CTTTCGGGAA ACCGATTATC 2520 

AAGGCAGCTT CCATGTACGA CAACGACAGA AACGTTCGCG GCGCACAGGA TGACCACTTC 2580 

CTTCTCGGCG GACACGATGG ATATCGTATT TTGTGTGCAC CTGTTGTGTG GGAGAATACA 2640 

ACCAGTCGCG ATCTGTACTT GCCTGTGCTG ACCAAATGGT ACAAATTCGG CCCTGACTAT 2700 

GACACCAAGC GCCTGGATTC TGCGTTGGAT GGAGGGCAGA TGATTAAGAA CTATTCTGTG 2760 

CCACAAAGCG ACTCTCCGAT ATTTGTGAGG GAAGGAGCTA TTCTCCCTAC CCGCTACACG 2820 

TTGGACGGTT CGAACAAGTC AATGAACACG TACACAGACA AAGACCCGTT GGTGTTTGAG 2880 

GTATTCCCTC TTGGAAACAA CCGTGCCGAC GGTATGTGTT ATCTTGATGA TGGCGGTATT 2940 

ACTACAGATG CTGAGGACCA TGGCAAATTC TCTGTTATCA ATGTCGAAGC CTTACGGAAA 3000 

GGTGTTACGA CGACGATCAA GTTTGCGTAT GACACTTATC AATACGTATT TGATGGTCCA 3060 

TTCTACGTTC GAATCCGTAA TCTTACGACT GCATCAAAAA TTAACGTGTC TTCTGGAGCG 3120 

GGTGAAGAGG ACATGACACC GACCTCTGCG AACTCGAGGG CAGCTTTGTT CAGTGATGGA 3180 

GGTGTTGGAG AATACTGGGC TGACAATGAT ACGTCTTCTC TGTGGATGAA GTTGCCAAAC 3240 

CTGGTTCTGC AAGACGCTGT GATTACCATT ACGTAG 3276 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 90 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Tyr Arg Trp Gln Glu Val Leu Tyr Thr Ala Met Tyr Gln Asn Ala Ala 
1 5 10 15 

Phe Gly Lys Pro Ile Ile Lys Ala Ala Ser Met Tyr Asn Asn Asp Ser 
20 25 30 

Asn Val Arg Arg Ala Gln Asn Asp His Phe Leu Leu Gly Gly His Asp 
35 40 45 

Gly Tyr Arg Ile Leu Cys Ala Pro Val Val Trp Glu Asn Ser Thr Glu 
50 55 60 

Arg Glu Leu Tyr Leu Pro Val Leu Thr Gln Trp Tyr Lys Phe Gly Pro 
65 70 75 80 

Asp Phe Asp Thr Lys Pro Leu Glu Gly Ala 
85 90 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(6, "") 

(D) OTHER INFORMATION: /note= "N is T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(9, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(12, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(15, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(18, "") 

(D) OTHER INFORMATION: /note= "N is G or A or T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(21, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
ATGTANAANA ANGANTCNAA NGT 23 
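
The feature table for SEQ ID NO: 6 restricts each N to a subset of bases, so the listed primer stands for a pool of concrete oligonucleotides. As a minimal illustrative sketch (not part of the original listing; the names `PRIMER`, `ALLOWED` and `expand` are hypothetical), the pool can be enumerated as follows, with the allowed bases copied from the /note= entries above:

```python
from itertools import product

# SEQ ID NO: 6 as printed, with the spacing removed.
PRIMER = "ATGTANAANAANGANTCNAANGT"

# Allowed bases at each N, keyed by 1-based position, copied from the
# /note= entries of the feature table for SEQ ID NO: 6.
ALLOWED = {6: "TC", 9: "CT", 12: "CT", 15: "CT", 18: "GATC", 21: "CT"}


def expand(primer, allowed):
    """Return every concrete oligonucleotide covered by a degenerate primer."""
    choices = [allowed[pos] if base == "N" else base
               for pos, base in enumerate(primer, start=1)]
    return ["".join(combo) for combo in product(*choices)]


oligos = expand(PRIMER, ALLOWED)
print(len(oligos))  # 2 * 2 * 2 * 2 * 4 * 2 = 128 sequences
```

One member of this pool, ATGTACAACAACGACTCGAACGT, is identical to the first 23 bases of SEQ ID NO: 12.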
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(6, "") 

(D) OTHER INFORMATION: /note= "N is T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(9, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(12, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(15, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(18, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(21, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ATGTANAANA ANGANAGNAA NGT 23 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(3, "") 

(D) OTHER INFORMATION: /note= "N is G or A or T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(6, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(9, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(12, "") 

(D) OTHER INFORMATION: /note= "N is G or A or T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(15, "") 

(D) OTHER INFORMATION: /note= "N is G or A or T or C" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
TANCCNTCNT GNCCNCC 17 
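
The listing does not state the orientation of SEQ ID NO: 8. As a hedged sketch, assuming it is an antisense oligonucleotide, its reverse complement can be computed with IUPAC ambiguity codes. Here the two-base positions from the feature table (6 and 9, "G or A") are written as R, which is an editorial encoding and not part of the original listing; `reverse_complement` is a hypothetical helper name.

```python
# IUPAC complement table: R = A or G, Y = C or T, N = any base.
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 8 with positions 6 and 9 (G or A) written as R.
primer = "TANCCRTCRTGNCCNCC"
sense = reverse_complement(primer)
print(sense)  # GGNGGNCAYGAYGGNTA
```

Read in frame, GGN GGN CAY GAY GGN encodes Gly Gly His Asp Gly, the same residues that end SEQ ID NO: 13, which is consistent with the antisense assumption but does not prove it.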
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(3, "") 

(D) OTHER INFORMATION: /note= "N is G or A or T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(6, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(9, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(12, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(18, "") 

(D) OTHER INFORMATION: /note= "N is C or T" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GGNCCNAANT TNTACCANTG 20 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(3, "") 

(D) OTHER INFORMATION: /note= "N is T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(6, "") 

(D) OTHER INFORMATION: /note= "N is G or A or T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(12, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(15, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TANCGNTGGC ANGANGT 17 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(3, "") 

(D) OTHER INFORMATION: /note= "N is T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(6, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(12, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(ix) FEATURE: 

(A) NAME/KEY: misc_difference 

(B) LOCATION: replace(15, "") 

(D) OTHER INFORMATION: /note= "N is G or A" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
TANAGNTGGC ANGANGT 17 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
ATGTACAACA ACGACTCGAA CGTTCGCAGG GCGCAGAACG ATCATTTCCT TCTTGGCGGC 60 
CACGACGGTT A 71 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Tyr Asn Asn Asp Ser Asn Val Arg Arg Ala Gln Asn Asp His Phe 
1 5 10 15 

Leu Leu Gly Gly His Asp Gly 
20 
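
SEQ ID NO: 13 has the length expected for a translation of SEQ ID NO: 12 (71 bases give 23 complete codons plus two unread bases). As a minimal sketch, assuming the standard genetic code and reading from the first base (the helper `translate` and the table construction are illustrative, not taken from the specification), the correspondence can be checked as follows:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO: 12 as listed, with the spacing and position numbers removed.
seq12 = ("ATGTACAACAACGACTCGAACGTTCGCAGGGCGCAGAACG"
         "ATCATTTCCTTCTTGGCGGCCACGACGGTTA")

peptide = translate(seq12)
print(peptide)  # one-letter form of SEQ ID NO: 13
```

The one-letter result, MYNNDSNVRRAQNDHFLLGGHDG, corresponds residue for residue to the three-letter sequence given for SEQ ID NO: 13.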

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATGTACAACA ACGACTCGAA CGTTCGCAGG GCGCAGAACG ATCATTTCCT TCTTGGTGGA 60 

CATGATGGAT ATCGCATTCT GTGCGCGCCT GTTGTGTGGG AGAATTCGAC CGAACGGAAT 120 

TGTACTTGCC CGTGCTGACC CAATGGTACA AATTCGGCCC 160 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Tyr Asn Asn Asp Ser Asn Val Arg Arg Ala Gln Asn Asp His Phe 
1 5 10 15 

Leu Leu Gly Gly His Asp Gly Tyr Arg Ile Leu Cys Ala Pro Val Val 
20 25 30 

Trp Glu Asn Ser Thr Glu Arg Glu Leu Tyr Leu Pro Val Leu Thr Gln 
35 40 45 

Trp Tyr Lys Phe Gly Pro 
50 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 238 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TACAGGTGGC AGGAGGTGTT GTACACTGCT ATGTACCAGA ATGCGGCTTT CGGGAAACCG 60 

ATTATCAAGG CAGCTTCCAT GTACGACAAC GACAGAAACG TTCGCGGCGC ACAGGATGAC 120 

CACTTCCTTC TCGGCGGACA CGATGGATAT CGTATTTTGT GTGCACCTGT TGTGTGGGAG 180 

AATACAACCA GTCGCGATCT GTACTTGCCT GTGCTGACCA GTGGTACAAA TTCGGCCC 238 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 79 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Tyr Arg Trp Gln Glu Val Leu Tyr Thr Ala Met Tyr Gln Asn Ala Ala 
1 5 10 15 

Phe Gly Lys Pro Ile Ile Lys Ala Ala Ser Met Tyr Asp Asn Asp Arg 
20 25 30 

Asn Val Arg Gly Ala Gln Asp Asp His Phe Leu Leu Gly Gly His Asp 
35 40 45 

Gly Tyr Arg Ile Leu Cys Ala Pro Val Val Trp Glu Asn Thr Thr Ser 
50 55 60 

Arg Asp Leu Tyr Leu Pro Val Leu Thr Lys Trp Tyr Lys Phe Gly 
65 70 75 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GCTCTAGAGC ATGTTTTCAA CCCTTGCG 28 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AGCTTGTTAA CATGTATCCA ACCCTCACCT TCGTGG 36 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
ACAATTGTAC ATAGGTTGGG AGTGGAAGCA CCGC 34 
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page [illegible], line 22 

B. IDENTIFICATION OF DEPOSIT Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 

The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 

Address of depositary institution (including postal code and country) 

23 St. Machar Drive 
Aberdeen 
Scotland 
AB2 1RY 
United Kingdom 

Date of deposit 

Accession Number 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, and any 
other designated state having equivalent legislation, a sample of the deposited 
microorganism will be made available until the publication of the mention of the 
grant of the European patent or until the date on which the application has been 
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a 
sample to an expert nominated by the person requesting the sample (Rule 28(4) 
EPC). 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession 
Number of Deposit") 

For receiving Office use only 

[ ] This sheet was received with the international application 

Authorized officer 

For International Bureau use only 

[ ] This sheet was received by the International Bureau on: 

Authorized officer 

Form PCT/RO/134 (July 1992) 
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page [illegible], line [illegible] 

B. IDENTIFICATION OF DEPOSIT Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 

The National Collections of Industrial and Marine Bacteria Limited (NCIMB) 

Address of depositary institution (including postal code and country) 

23 St. Machar Drive 
Aberdeen 
Scotland 
AB2 1RY 

Date of deposit 

Accession Number 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, and any 
other designated state having equivalent legislation, a sample of the deposited 
microorganism will be made available until the publication of the mention of the 
grant of the European patent or until the date on which the application has been 
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a 
sample to an expert nominated by the person requesting the sample (Rule 28(4) 
EPC). 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession 
Number of Deposit") 

For receiving Office use only 

[ ] This sheet was received with the international application 

Authorized officer 

Form PCT/RO/134 (July 1992) 

For International Bureau use only 

[ ] This sheet was received by the International Bureau on: 



Authorized officer 



WO 95/10618 
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis) 

A. The indications made below relate to the microorganism referred to in the description 
on page [illegible] 

B. IDENTIFICATION OF DEPOSIT Further deposits are identified on an additional sheet [ ] 

Name of depositary institution 

Culture Collection of Algae and Protozoa (CCAP) 

Address of depositary institution (including postal code and country) 

Dunstaffnage Marine Laboratory 
P.O. Box 3 
Oban 
Argyll PA34 4AD 

Date of deposit 

Accession Number 

[illegible] 

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet [ ] 

In respect of those designations in which a European patent is sought, and any 
other designated state having equivalent legislation, a sample of the deposited 
microorganism will be made available until the publication of the mention of the 
grant of the European patent or until the date on which the application has been 
refused or withdrawn or is deemed to be withdrawn, only by the issue of such a 
sample to an expert nominated by the person requesting the sample (Rule 28(4) 
EPC). 

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States) 

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable) 

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession 
Number of Deposit") 

For receiving Office use only 

[ ] This sheet was received with the international application 

Authorized officer 

Form PCT/RO/134 (July 1992) 

For International Bureau use only 

[ ] This sheet was received by the International Bureau on: 



Authorized officer 
CLAIMS 

1. A method of preparing the enzyme α-1,4-glucan lyase (GL) comprising isolating 
the enzyme from a fungally infected algae. 

2. A method according to claim 1 wherein the enzyme is isolated and/or further 
purified using a gel that is not degraded by the enzyme. 

3. A method according to claim 2 wherein the gel is based on dextrin or derivatives 
thereof, preferably a cyclodextrin, more preferably beta-cyclo-dextrin. 

4. A GL enzyme prepared by the method according to any one of claims 1 to 3. 

5. An enzyme comprising the amino acid sequence SEQ. ID. No. 1 or SEQ. ID. 
No. 2, or any variant thereof. 

6. A nucleotide sequence coding for the enzyme α-1,4-glucan lyase. 

7. A nucleotide sequence according to claim 6 wherein the sequence is a DNA 
sequence. 

8. A nucleotide sequence according to claim 7 wherein the DNA sequence comprises 
a sequence that is the same as, or is complementary to, or has substantial homology 
with, or contains any suitable codon substitutions for any of those of, SEQ. ID. No. 
3 or SEQ. ID. No. 4. 

9. A method of preparing the enzyme α-1,4-glucan lyase comprising expressing the 
nucleotide sequence according to any one of claims 6 to 8. 

10. A method according to any one of the preceding claims wherein the algae is 
Gracilariopsis lemaneiformis. 

11. The use of beta-cyclodextrin to purify an enzyme, preferably GL. 

12. A nucleotide sequence wherein the DNA sequence comprises a sequence that is 
the same as, or is complementary to, or has substantial homology with, or contains 
any suitable codon substitutions for any of those of, SEQ. ID. No. 3 or SEQ. ID. 
No. 4. 
Fig. 1. Calcofluor White stainings revealing fungi in upper part and lower part of 
Gracilaria lemnaeformis (108x and 294x). 

Fig. 2. PAS/Anilinblue Black staining of Gracilaria lemnaeformis with fungi. 
The fungi have a significantly higher content of carbohydrates. 

Fig. 4. The antisense detections with clone 2 probe (upper row) are restricted to the fungi 
illustrated by the Calcofluor White staining of the succeeding section (lower row) (46x 
and 108x). 

Fig. 5. Intense antisense detections with clone 2 probe are found over the fungi in 
Gracilaria lemnaeformis (294x). 

FIGURE 8 

MFSTLAFVAP SALGASTFVG AEVRSNVRIH SAFPAVHTAT RKTNRLNVSM TALSDKQTAT 
AGSTDNPDGI DYKTYDYVGV WGFSPLSNTN WFAAGSSTPG GITDWTATMN VNFDRIDNPS 
ITVQHPVQVQ VTSYNNNSYR VRFNPDGPIR DVTRGPILKQ QLDWIRTQEL SEGCDPGMTF 
TSEGFLTFET KDLSVIIYGN FKTRVTRKSD GKVIMENDEV GTASSGNKCR GLMFVDRLYG 
NAIASVNKNF RNDAVKQEGF YGAGEVNCKY QDTYILERTG IAMTNYNYDN LNYNQWDLRP 
PHHDGALNPD YYIPMYYAAP WLIVNGCAGT SEQYSYGWFM DNVSQSYMNT GDTTWNSGQE 
DLAYMGAQYG PFDQHFVYGA GGGMECVVTA FSLLQGKEFE NQVLNKRSVM PPKYVFGFFQ 
GVFGTSSLLR AHMPAGENNI SVEEIVEGYQ NNNFPFEGLA VDVDMQDNLR VFTTKGEFWT 
ANRVGTGGDP NNRSVFEWAH DKGLVCQTNI TCFLRNDNEG QDYEVNQTLR ERQLYTKNDS 
LTGTDFGMTD DGPSDAYIGH LDYGGGVECD ALFPDWGRPD VAEWWGNNYK KLFSIGLDFV 
WQDMTVPAMM PHKIGDDINV KPDGNWPNAD DPSNGQYNWK TYHPQVLVTD MRYENHGREP 
MVTQRNIHAY TLCESTRKEG IVENADTLTK FRRSYIISRG GYIGNQHFGG MWVGDNSTTS 
NYIQMMIANN INMNMSCLPL VGSDIGGFTS YDNENQRTPC TGDLMVRYVQ AGCLLPWFRN 
HYDRWIESKD HGKDYQELYM YPNEMDTLRK FVEFRYRWQE VLYTAMYQNA AFGKPIIKAA 
SMYNNDSNVR RAQNDHFLLG GHDGYRILCA PVVWENSTER ELYLPVLTQW YKFGPDFDTK 
PLEGAMNGGD RIYNYPVPQS ESPIFVREGA ILPTRYTLNG ENKSLNTYTD EDPLVFEVFP 
LGNNRADGMC YLDDGGVTTN AEDNGKFSVV KVAAEQDGGT ETITFTNDCY EYVFGGPFYV 
RVRGAQSPSN IHVSSGAGSQ DMKVSSATSR AALFNDGENG DFWVDQETDS LWLKLPNVVL 
PDAVITIT 

FIGURE 9 

FIGURE 9 continued 

Figure 10. Microphotograph of a fungal hypha (f) growing between algal cell walls (w). 
Note grains of floridean starch (s) and thylakoids (arrows) in the algal cell. 
Bar = 2 μm. 



